{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[![Logo Optimus](https://raw.githubusercontent.com/ironmussa/Optimus/master/images/optimus-logo.png)](https://hioptimus.com) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[![PyPI version](https://badge.fury.io/py/optimuspyspark.svg)](https://badge.fury.io/py/optimuspyspark) [![Build Status](https://travis-ci.org/ironmussa/Optimus.svg?branch=master)](https://travis-ci.org/ironmussa/Optimus) [![Documentation Status](https://readthedocs.org/projects/optimus-ironmussa/badge/?version=latest)](http://optimus-ironmussa.readthedocs.io/en/latest/?badge=latest)  [![built_by iron](https://img.shields.io/badge/built_by-iron-FF69A4.svg)](http://ironmussa.com) [![Updates](https://pyup.io/repos/github/ironmussa/Optimus/shield.svg)](https://pyup.io/repos/github/ironmussa/Optimus/)  [![GitHub release](https://img.shields.io/github/release/ironmussa/optimus.svg)](https://github.com/ironmussa/Optimus/) \n",
    "[![Codacy Badge](https://api.codacy.com/project/badge/Grade/02b3ba0fe2b64d6297c6b8320f8b15a7)](https://www.codacy.com/app/argenisleon/Optimus?utm_source=github.com&amp;utm_medium=referral&amp;utm_content=ironmussa/Optimus&amp;utm_campaign=Badge_Grade)\n",
    "[![Coverage Status](https://coveralls.io/repos/github/ironmussa/Optimus/badge.svg?branch=master)](https://coveralls.io/github/ironmussa/Optimus?branch=master) [![Mentioned in Awesome Data Science](https://awesome.re/mentioned-badge.svg)](https://github.com/bulutyazilim/awesome-datascience)![Discord](https://img.shields.io/discord/579030865468719104.svg)\n",
    "\n",
    "[![Downloads](https://pepy.tech/badge/optimuspyspark)](https://pepy.tech/project/optimuspyspark)\n",
    "[![Downloads](https://pepy.tech/badge/optimuspyspark/month)](https://pepy.tech/project/optimuspyspark/month)\n",
    "[![Downloads](https://pepy.tech/badge/optimuspyspark/week)](https://pepy.tech/project/optimuspyspark/week)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Python 3.7.6\n"
     ]
    }
   ],
   "source": [
    "!python --version"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'3.4.5'"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import nltk\n",
    "nltk.__version__"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To launch a live notebook server to test optimus using binder or Colab, click on one of the following badges:\n",
    "\n",
    "[![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/ironmussa/Optimus/master)\n",
    "[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ironmussa/Optimus/blob/master/examples/10_min_from_spark_to_pandas_with_optimus.ipynb)\n",
    "\n",
    "PyOptimus is the missing framework to profile, clean, process and plot small and big data. PyOptimus run over pandas, Dask, cuDF, Dask-cuDF, Spark, Vaex so you can use any of this librearies withou changing your code.\n",
    "\n",
    "## Installation (pip):  \n",
    "  \n",
    "In your terminal just type  `pip install pyoptimus`\n",
    "\n",
    "### Requirements\n",
    "* Python>=3.8  \n",
    "\n",
    "For the engines\n",
    "* RAPIDS >= 0.19\n",
    "* Dask >= 2021.2.0\n",
    "* Vaex >= 4.1\n",
    "* Apache Spark >= 21.8 \n",
    "* Ibis(WIP)\n",
    "\n",
    "## Why PyOptimus\n",
    "\n",
    "Why so many engines\n",
    "Every engine has\n",
    "\n",
    "## Examples \n",
    "\n",
    "You can go to the 10 minutes to Optimus [notebook](https://github.com/ironmussa/Optimus/blob/master/examples/10_min_from_spark_to_pandas_with_optimus.ipynb) where you can find the basic to start working. \n",
    "\n",
    "Also you can go to the [examples](examples/) folder to found specific notebooks about data cleaning, data wrangling, profiling and how to create ML.\n",
    "\n",
    "Besides check the [Cheat Sheet](https://htmlpreview.github.io/?https://github.com/ironmussa/Optimus/blob/master/docs/cheatsheet/optimus_cheat_sheet.html) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Documentation\n",
    "  \n",
    "[![Documentation](https://media.readthedocs.com/corporate/img/header-logo.png)](http://docs.hioptimus.com/en/latest/)  \n",
    "  \n",
    "## Feedback \n",
    "Feedback is what drive Optimus future, so please take a couple of minutes to help shape the Optimus' Roadmap:  http://bit.ly/optimus_survey  \n",
    "\n",
    "Also if you want to a suggestion or feature request use https://github.com/ironmussa/optimus/issues\n",
    " \n",
    "## Start Optimus"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "%load_ext autoreload\n",
    "%autoreload 2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "sys.path.append(\"..\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "C:\\Users\\argenisleon\\Anaconda3\\lib\\site-packages\\socks.py:58: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working\n",
      "  from collections import Callable\n",
      "\n",
      "    You are using PySparkling of version 2.4.10, but your PySpark is of\n",
      "    version 2.3.1. Please make sure Spark and PySparkling versions are compatible. \n",
      "INFO:optimus:Operative System:Windows\n",
      "INFO:optimus:Just check that Spark and all necessary environments vars are present...\n",
      "INFO:optimus:-----\n",
      "INFO:optimus:SPARK_HOME=C:\\opt\\spark\\spark-2.3.1-bin-hadoop2.7\n",
      "INFO:optimus:HADOOP_HOME=C:\\opt\\hadoop-2.7.7\n",
      "INFO:optimus:PYSPARK_PYTHON=C:\\Users\\argenisleon\\Anaconda3\\python.exe\n",
      "INFO:optimus:PYSPARK_DRIVER_PYTHON=jupyter\n",
      "INFO:optimus:PYSPARK_SUBMIT_ARGS=--jars \"file:///C:/Users/argenisleon/Documents/Optimus/optimus/jars/RedshiftJDBC42-1.2.16.1027.jar,file:///C:/Users/argenisleon/Documents/Optimus/optimus/jars/mysql-connector-java-8.0.16.jar,file:///C:/Users/argenisleon/Documents/Optimus/optimus/jars/ojdbc8.jar,file:///C:/Users/argenisleon/Documents/Optimus/optimus/jars/postgresql-42.2.5.jar\" --driver-class-path \"C:/Users/argenisleon/Documents/Optimus/optimus/jars/RedshiftJDBC42-1.2.16.1027.jar;C:/Users/argenisleon/Documents/Optimus/optimus/jars/mysql-connector-java-8.0.16.jar;C:/Users/argenisleon/Documents/Optimus/optimus/jars/ojdbc8.jar;C:/Users/argenisleon/Documents/Optimus/optimus/jars/postgresql-42.2.5.jar\" --conf \"spark.sql.catalogImplementation=hive\" pyspark-shell\n",
      "INFO:optimus:JAVA_HOME=C:\\java\n",
      "INFO:optimus:Pyarrow Installed\n",
      "INFO:optimus:-----\n",
      "INFO:optimus:Starting or getting SparkSession and SparkContext...\n",
      "INFO:optimus:Spark Version:2.3.1\n",
      "INFO:optimus:\n",
      "                             ____        __  _                     \n",
      "                            / __ \\____  / /_(_)___ ___  __  _______\n",
      "                           / / / / __ \\/ __/ / __ `__ \\/ / / / ___/\n",
      "                          / /_/ / /_/ / /_/ / / / / / / /_/ (__  ) \n",
      "                          \\____/ .___/\\__/_/_/ /_/ /_/\\__,_/____/  \n",
      "                              /_/                                  \n",
      "                              \n",
      "INFO:optimus:Transform and Roll out...\n",
      "INFO:optimus:Optimus successfully imported. Have fun :).\n",
      "INFO:optimus:Config.ini not found\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<style> /* Tables*/\n",
       "\n",
       " .data_type {\n",
       "        font-size: 0.8em;\n",
       "        font-weight: normal;\n",
       "    }\n",
       "\n",
       "    .column_name {\n",
       "        font-size: 1.2em;\n",
       "    }\n",
       "\n",
       "    .info_items {\n",
       "        margin: 10px 0;\n",
       "        font-size: 0.8em;\n",
       "    }\n",
       "\n",
       "    .optimus_table td {\n",
       "        border: 0px;\n",
       "    }\n",
       "\n",
       "    .optimus_table tr:nth-child(even) {\n",
       "        background-color: #f2f2f2 !important;\n",
       "    }\n",
       "\n",
       "    .optimus_table tr:nth-child(odd) {\n",
       "        background-color: #ffffff !important;\n",
       "    }\n",
       "\n",
       "    .optimus_table thead {\n",
       "        border-bottom: 1px solid black;\n",
       "    }\n",
       "    .optimus_table{\n",
       "        font-size: 12px;\n",
       "    }\n",
       "\n",
       "    .optimus_table tbody{\n",
       "        font-family: monospace;\n",
       "        border-bottom: 1px solid #cccccc;\n",
       "    }\n",
       "\n",
       "    /* Profiler */\n",
       "        .main{\n",
       "        width:100%;\n",
       "        overflow:auto;\n",
       "        border-bottom:1px solid #eeeeee;\n",
       "        padding: 10px 0;\n",
       "    }\n",
       "    .panel_profiler{\n",
       "        margin-right:2%;\n",
       "        float:left;\n",
       "        padding-bottom:2%;\n",
       "    }\n",
       "    .panel_profiler tbody{\n",
       "        font-family:monospace;\n",
       "    }\n",
       "    .title_profiler{\n",
       "        padding:20px;\n",
       "        background-color: #eeeeee\n",
       "    }\n",
       "    .info{\n",
       "        overflow: auto\n",
       "    }\n",
       "    .main td, main th{\n",
       "        padding:0em\n",
       "    }\n",
       "    .panel_profiler td {\n",
       "        padding:0.2em\n",
       "    }\n",
       "    .none, .true{\n",
       "        color:#0000ff\n",
       "    }\n",
       "    .optimus_table th {\n",
       "        font-family:sans-serif;\n",
       "    }\n",
       "\n",
       "    .info_items{\n",
       "        font-family:sans-serif;\n",
       "        font-size:10px;\n",
       "    }\n",
       "</style>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from optimus import Optimus\n",
    "op= Optimus(verbose=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You also can use an already created Spark session:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:\n",
      "                             ____        __  _                     \n",
      "                            / __ \\____  / /_(_)___ ___  __  _______\n",
      "                           / / / / __ \\/ __/ / __ `__ \\/ / / / ___/\n",
      "                          / /_/ / /_/ / /_/ / / / / / / /_/ (__  ) \n",
      "                          \\____/ .___/\\__/_/_/ /_/ /_/\\__,_/____/  \n",
      "                              /_/                                  \n",
      "                              \n",
      "INFO:optimus:Transform and Roll out...\n",
      "INFO:optimus:Optimus successfully imported. Have fun :).\n",
      "INFO:optimus:Config.ini not found\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<style> /* Tables*/\n",
       "\n",
       " .data_type {\n",
       "        font-size: 0.8em;\n",
       "        font-weight: normal;\n",
       "    }\n",
       "\n",
       "    .column_name {\n",
       "        font-size: 1.2em;\n",
       "    }\n",
       "\n",
       "    .info_items {\n",
       "        margin: 10px 0;\n",
       "        font-size: 0.8em;\n",
       "    }\n",
       "\n",
       "    .optimus_table td {\n",
       "        border: 0px;\n",
       "    }\n",
       "\n",
       "    .optimus_table tr:nth-child(even) {\n",
       "        background-color: #f2f2f2 !important;\n",
       "    }\n",
       "\n",
       "    .optimus_table tr:nth-child(odd) {\n",
       "        background-color: #ffffff !important;\n",
       "    }\n",
       "\n",
       "    .optimus_table thead {\n",
       "        border-bottom: 1px solid black;\n",
       "    }\n",
       "    .optimus_table{\n",
       "        font-size: 12px;\n",
       "    }\n",
       "\n",
       "    .optimus_table tbody{\n",
       "        font-family: monospace;\n",
       "        border-bottom: 1px solid #cccccc;\n",
       "    }\n",
       "\n",
       "    /* Profiler */\n",
       "        .main{\n",
       "        width:100%;\n",
       "        overflow:auto;\n",
       "        border-bottom:1px solid #eeeeee;\n",
       "        padding: 10px 0;\n",
       "    }\n",
       "    .panel_profiler{\n",
       "        margin-right:2%;\n",
       "        float:left;\n",
       "        padding-bottom:2%;\n",
       "    }\n",
       "    .panel_profiler tbody{\n",
       "        font-family:monospace;\n",
       "    }\n",
       "    .title_profiler{\n",
       "        padding:20px;\n",
       "        background-color: #eeeeee\n",
       "    }\n",
       "    .info{\n",
       "        overflow: auto\n",
       "    }\n",
       "    .main td, main th{\n",
       "        padding:0em\n",
       "    }\n",
       "    .panel_profiler td {\n",
       "        padding:0.2em\n",
       "    }\n",
       "    .none, .true{\n",
       "        color:#0000ff\n",
       "    }\n",
       "    .optimus_table th {\n",
       "        font-family:sans-serif;\n",
       "    }\n",
       "\n",
       "    .info_items{\n",
       "        font-family:sans-serif;\n",
       "        font-size:10px;\n",
       "    }\n",
       "</style>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from pyspark.sql import SparkSession\n",
    "from optimus import Optimus\n",
    "\n",
    "spark = SparkSession.builder.appName('optimus').getOrCreate()\n",
    "op= Optimus(spark)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Loading data\n",
    "Now Optimus can load data in csv, json, parquet, avro, excel from a local file or URL."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Downloading foo.json from https://raw.githubusercontent.com/ironmussa/Optimus/master/examples/data/foo.json\n",
      "INFO:optimus:Downloaded 2596 bytes\n",
      "INFO:optimus:Creating DataFrame for foo.json. Please wait...\n",
      "INFO:optimus:Successfully created DataFrame for 'foo.json'\n"
     ]
    }
   ],
   "source": [
    "#csv\n",
    "df = op.load.csv(\"../examples/data/foo.csv\")\n",
    "\n",
    "#json\n",
    "# Use a local file\n",
    "df = op.load.json(\"../examples/data/foo.json\")\n",
    "\n",
    "# Use a url\n",
    "df = op.load.json(\"https://raw.githubusercontent.com/ironmussa/Optimus/master/examples/data/foo.json\")\n",
    "\n",
    "# parquet\n",
    "df = op.load.parquet(\"../examples/data/foo.parquet\")\n",
    "\n",
    "# avro\n",
    "# df = op.load.avro(\"../examples/data/foo.avro\").table(5)\n",
    "\n",
    "# excel \n",
    "df = op.load.excel(\"../examples/data/titanic3.xls\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Also you can load data from oracle, redshift, mysql and postgres. See ***Database connection***"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Saving Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:`pclass`,`survived`,`name`,`sex`,`age`,`sibsp`,`parch`,`ticket`,`fare`,`cabin`,`embarked`,`boat`,`body`,`home_dest` column(s) was not processed because is/are not date,array,vector,binary,null\n",
      "INFO:optimus:`pclass`,`survived`,`name`,`sex`,`age`,`sibsp`,`parch`,`ticket`,`fare`,`cabin`,`embarked`,`boat`,`body`,`home_dest` column(s) was not processed because is/are not null\n"
     ]
    }
   ],
   "source": [
    "#csv\n",
    "df.save.csv(\"data/foo.csv\")\n",
    "\n",
    "# json\n",
    "df.save.json(\"data/foo.json\")\n",
    "\n",
    "# parquet\n",
    "df.save.parquet(\"data/foo.parquet\")\n",
    "\n",
    "# avro\n",
    "#df.save.avro(\"examples/data/foo.avro\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Also you can save data to oracle, redshift, mysql and postgres. See ***Database connection***"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Handling Spark jars, packages and repositories"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With optimus is easy to loading jars, packages and repos. You can init optimus/spark like "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Operative System:Windows\n",
      "INFO:optimus:Just check that Spark and all necessary environments vars are present...\n",
      "INFO:optimus:-----\n",
      "INFO:optimus:SPARK_HOME=C:\\opt\\spark\\spark-2.3.1-bin-hadoop2.7\n",
      "INFO:optimus:HADOOP_HOME=C:\\opt\\hadoop-2.7.7\n",
      "INFO:optimus:PYSPARK_PYTHON=C:\\Users\\argenisleon\\Anaconda3\\python.exe\n",
      "INFO:optimus:PYSPARK_DRIVER_PYTHON=jupyter\n",
      "INFO:optimus:PYSPARK_SUBMIT_ARGS=--repositories myrepo --packages org.apache.spark:spark-avro_2.12:2.4.3 --jars \"my.jar,file:///C:/Users/argenisleon/Documents/Optimus/optimus/jars/RedshiftJDBC42-1.2.16.1027.jar,file:///C:/Users/argenisleon/Documents/Optimus/optimus/jars/mysql-connector-java-8.0.16.jar,file:///C:/Users/argenisleon/Documents/Optimus/optimus/jars/ojdbc8.jar,file:///C:/Users/argenisleon/Documents/Optimus/optimus/jars/postgresql-42.2.5.jar\" --driver-class-path \"this_is_a_jar_class_path.jar;C:/Users/argenisleon/Documents/Optimus/optimus/jars/RedshiftJDBC42-1.2.16.1027.jar;C:/Users/argenisleon/Documents/Optimus/optimus/jars/mysql-connector-java-8.0.16.jar;C:/Users/argenisleon/Documents/Optimus/optimus/jars/ojdbc8.jar;C:/Users/argenisleon/Documents/Optimus/optimus/jars/postgresql-42.2.5.jar\" --conf \"spark.sql.catalogImplementation=hive\" pyspark-shell\n",
      "INFO:optimus:JAVA_HOME=C:\\java\n",
      "INFO:optimus:Pyarrow Installed\n",
      "INFO:optimus:-----\n",
      "INFO:optimus:Starting or getting SparkSession and SparkContext...\n",
      "INFO:optimus:Spark Version:2.3.1\n",
      "INFO:optimus:\n",
      "                             ____        __  _                     \n",
      "                            / __ \\____  / /_(_)___ ___  __  _______\n",
      "                           / / / / __ \\/ __/ / __ `__ \\/ / / / ___/\n",
      "                          / /_/ / /_/ / /_/ / / / / / / /_/ (__  ) \n",
      "                          \\____/ .___/\\__/_/_/ /_/ /_/\\__,_/____/  \n",
      "                              /_/                                  \n",
      "                              \n",
      "INFO:optimus:Transform and Roll out...\n",
      "INFO:optimus:Optimus successfully imported. Have fun :).\n",
      "INFO:optimus:Config.ini not found\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<style> /* Tables*/\n",
       "\n",
       " .data_type {\n",
       "        font-size: 0.8em;\n",
       "        font-weight: normal;\n",
       "    }\n",
       "\n",
       "    .column_name {\n",
       "        font-size: 1.2em;\n",
       "    }\n",
       "\n",
       "    .info_items {\n",
       "        margin: 10px 0;\n",
       "        font-size: 0.8em;\n",
       "    }\n",
       "\n",
       "    .optimus_table td {\n",
       "        border: 0px;\n",
       "    }\n",
       "\n",
       "    .optimus_table tr:nth-child(even) {\n",
       "        background-color: #f2f2f2 !important;\n",
       "    }\n",
       "\n",
       "    .optimus_table tr:nth-child(odd) {\n",
       "        background-color: #ffffff !important;\n",
       "    }\n",
       "\n",
       "    .optimus_table thead {\n",
       "        border-bottom: 1px solid black;\n",
       "    }\n",
       "    .optimus_table{\n",
       "        font-size: 12px;\n",
       "    }\n",
       "\n",
       "    .optimus_table tbody{\n",
       "        font-family: monospace;\n",
       "        border-bottom: 1px solid #cccccc;\n",
       "    }\n",
       "\n",
       "    /* Profiler */\n",
       "        .main{\n",
       "        width:100%;\n",
       "        overflow:auto;\n",
       "        border-bottom:1px solid #eeeeee;\n",
       "        padding: 10px 0;\n",
       "    }\n",
       "    .panel_profiler{\n",
       "        margin-right:2%;\n",
       "        float:left;\n",
       "        padding-bottom:2%;\n",
       "    }\n",
       "    .panel_profiler tbody{\n",
       "        font-family:monospace;\n",
       "    }\n",
       "    .title_profiler{\n",
       "        padding:20px;\n",
       "        background-color: #eeeeee\n",
       "    }\n",
       "    .info{\n",
       "        overflow: auto\n",
       "    }\n",
       "    .main td, main th{\n",
       "        padding:0em\n",
       "    }\n",
       "    .panel_profiler td {\n",
       "        padding:0.2em\n",
       "    }\n",
       "    .none, .true{\n",
       "        color:#0000ff\n",
       "    }\n",
       "    .optimus_table th {\n",
       "        font-family:sans-serif;\n",
       "    }\n",
       "\n",
       "    .info_items{\n",
       "        font-family:sans-serif;\n",
       "        font-size:10px;\n",
       "    }\n",
       "</style>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "op= Optimus(repositories = \"myrepo\", packages=\"org.apache.spark:spark-avro_2.12:2.4.3\", jars=\"my.jar\", driver_class_path=\"this_is_a_jar_class_path.jar\", verbose= True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create dataframes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lines_to_next_cell": 0
   },
   "source": [
    "Also you can create a dataframe from scratch"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pyspark.sql.types import *\n",
    "from datetime import date, datetime\n",
    "\n",
    "df = op.create.df(\n",
    "    [\n",
    "        (\"names\", \"str\", True), \n",
    "        (\"height(ft)\",\"int\", True), \n",
    "        (\"function\", \"str\", True), \n",
    "        (\"rank\", \"int\", True), \n",
    "        (\"age\",\"int\",True),\n",
    "        (\"weight(t)\",\"float\",True),\n",
    "        (\"japanese name\", ArrayType(StringType()), True),\n",
    "        (\"last position seen\", \"str\", True),\n",
    "        (\"date arrival\", \"str\", True),\n",
    "        (\"last date seen\", \"str\", True),\n",
    "        (\"attributes\", ArrayType(FloatType()), True),\n",
    "        (\"DateType\"),\n",
    "        (\"Tiemstamp\"),\n",
    "        (\"Cybertronian\", \"bool\", True), \n",
    "        (\"NullType\", \"null\", True),\n",
    "    ],\n",
    "    [\n",
    "        (\"Optim'us\", 28, \"Leader\", 10, 5000000, 4.3, [\"Inochi\", \"Convoy\"], \"19.442735,-99.201111\", \"1980/04/10\",\n",
    "         \"2016/09/10\", [8.5344, 4300.0], date(2016, 9, 10), datetime(2014, 6, 24), True,\n",
    "         None),\n",
    "        (\"bumbl#ebéé  \", 17, \"Espionage\", 7, 5000000, 2.0, [\"Bumble\", \"Goldback\"], \"10.642707,-71.612534\", \"1980/04/10\",\n",
    "         \"2015/08/10\", [5.334, 2000.0], date(2015, 8, 10), datetime(2014, 6, 24), True,\n",
    "         None),\n",
    "        (\"ironhide&\", 26, \"Security\", 7, 5000000, 4.0, [\"Roadbuster\"], \"37.789563,-122.400356\", \"1980/04/10\",\n",
    "         \"2014/07/10\", [7.9248, 4000.0], date(2014, 6, 24), datetime(2014, 6, 24), True,\n",
    "         None),\n",
    "        (\"Jazz\", 13, \"First Lieutenant\", 8, 5000000, 1.80, [\"Meister\"], \"33.670666,-117.841553\", \"1980/04/10\",\n",
    "         \"2013/06/10\", [3.9624, 1800.0], date(2013, 6, 24), datetime(2014, 6, 24), True, None),\n",
    "        (\"Megatron\", None, \"None\", 10, 5000000, 5.70, [\"Megatron\"], None, \"1980/04/10\", \"2012/05/10\", [None, 5700.0],\n",
    "         date(2012, 5, 10), datetime(2014, 6, 24), True, None),\n",
    "        (\"Metroplex_)^$\", 300, \"Battle Station\", 8, 5000000, None, [\"Metroflex\"], None, \"1980/04/10\", \"2011/04/10\",\n",
    "         [91.44, None], date(2011, 4, 10), datetime(2014, 6, 24), True, None),\n",
    "\n",
    "    ], infer_schema = True).ext.h_repartition(1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With .table() you have a beautifull way to show your data. You have extra information like column number, column data type and marked white spaces \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "lines_to_next_cell": 2
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading page (1/2)\n",
      "Rendering (2/2)                                                    \n",
      "Done                                                               \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/table.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "df.table_image(\"images/table.png\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Also you can create a dataframe from a Pandas dataframe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading page (1/2)\n",
      "Rendering (2/2)                                                    \n",
      "Done                                                               \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/pandas.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "pdf = pd.DataFrame({'A': {0: 'a', 1: 'b', 2: 'c',3:'d'},\n",
    "                    'B': {0: 1, 1: 3, 2: 5,3:7},\n",
    "                       'C': {0: 2, 1: 4, 2: 6,3:None},\n",
    "                       'D': {0:'1980/04/10',1:'1980/04/10',2:'1980/04/10',3:'1980/04/10'},\n",
    "                       })\n",
    "\n",
    "s_pdf = op.create.df(pdf=pdf)\n",
    "s_pdf.table_image(\"images/pandas.png\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Cleaning and Processing\n",
    "  \n",
    "Optimus V2 was created to make data cleaning a breeze. The API was designed to be super easy to newcomers and very familiar for people that comes from Pandas.\n",
    "Optimus expands the Spark DataFrame functionality adding .rows and .cols attributes.\n",
    "\n",
    "For example you can load data from a url, transform and apply some predefined cleaning functions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'names' with function _lower\n",
      "INFO:optimus:Using 'column_exp' to process column 'function' with function _lower\n",
      "INFO:optimus:Using 'column_exp' to process column 'date arrival' with function _date_transform\n",
      "INFO:optimus:Using 'column_exp' to process column 'date arrival' with function _years_between\n",
      "INFO:optimus:Using 'column_exp' to process column 'from arrival' with function _cast_to\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'names' with function _remove_accents\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'names' with function multiple_replace\n",
      "INFO:optimus:`japanese name`,`attributes`,`tiemstamp`,`nulltype` column(s) was not processed because is/are not byte,short,big,int,double,float,string,date,bool\n",
      "INFO:optimus:Using 'column_exp' to process column 'names' with function _trim\n",
      "INFO:optimus:Using 'column_exp' to process column 'height(ft)' with function _trim\n",
      "INFO:optimus:Using 'column_exp' to process column 'function' with function _trim\n",
      "INFO:optimus:Using 'column_exp' to process column 'rank' with function _trim\n",
      "INFO:optimus:Using 'column_exp' to process column 'age' with function _trim\n",
      "INFO:optimus:Using 'column_exp' to process column 'weight(t)' with function _trim\n",
      "INFO:optimus:Using 'column_exp' to process column 'last position seen' with function _trim\n",
      "INFO:optimus:Using 'column_exp' to process column 'date arrival' with function _trim\n",
      "INFO:optimus:Using 'column_exp' to process column 'last date seen' with function _trim\n",
      "INFO:optimus:Using 'column_exp' to process column 'datetype' with function _trim\n",
      "INFO:optimus:Using 'column_exp' to process column 'cybertronian' with function _trim\n",
      "INFO:optimus:Using 'column_exp' to process column 'new_age' with function _trim\n",
      "INFO:optimus:Using 'column_exp' to process column 'from arrival' with function _trim\n"
     ]
    }
   ],
   "source": [
    "# This is a custom function\n",
    "def func(value, arg):\n",
    "    return \"this was a number\"\n",
    "    \n",
    "new_df = df\\\n",
    "    .rows.sort(\"rank\",\"desc\")\\\n",
    "    .withColumn('new_age', df.age)\\\n",
    "    .cols.lower([\"names\",\"function\"])\\\n",
    "    .cols.date_transform(\"date arrival\", \"yyyy/MM/dd\", \"dd-MM-YYYY\")\\\n",
    "    .cols.years_between(\"date arrival\", \"dd-MM-YYYY\", output_cols = \"from arrival\")\\\n",
    "    .cols.remove_accents(\"names\")\\\n",
    "    .cols.remove_special_chars(\"names\")\\\n",
    "    .rows.drop(df[\"rank\"]>8)\\\n",
    "    .cols.rename(str.lower)\\\n",
    "    .cols.trim(\"*\")\\\n",
    "    .cols.unnest(\"japanese name\", output_cols=\"other names\")\\\n",
    "    .cols.unnest(\"last position seen\",separator=\",\", output_cols=\"pos\")\\\n",
    "    .cols.drop([\"last position seen\", \"japanese name\",\"date arrival\", \"cybertronian\", \"nulltype\"])\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You transform this"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading page (1/2)\n",
      "Rendering (2/2)                                                    \n",
      "Done                                                               \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/table1.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "df.table_image(\"images/table1.png\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Into this"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading page (1/2)\n",
      "Rendering (2/2)                                                    \n",
      "Done                                                               \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/table2.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "new_df.table_image(\"images/table2.png\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that you can mix Optimus functions, Spark functions (such as `.withColumn()`) and every other function available on a Spark DataFrame in the same chain. To learn about all the Optimus functionality, see these [notebooks](examples/)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Handling column output\n",
    "\n",
    "With Optimus you can control where the output of a transformation is written."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pyspark.sql import functions as F\n",
    "\n",
    "def func(col_name, attr):\n",
    "    return F.upper(F.col(col_name))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If a **string** is passed to **input_cols** and **output_cols** is not defined the result from the operation is going to be saved in the same input column"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'names' with function func\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading page (1/2)\n",
      "Rendering (2/2)                                                    \n",
      "Done                                                               \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/column_output_1.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "output_df = df.cols.apply(input_cols=\"names\", output_cols=None,func=func)\n",
    "output_df.table_image(\"images/column_output_1.png\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If a **string** is passed to **input_cols** and a **string** is passed to **output_cols** the output is going to be saved in the output column"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'names' with function func\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading page (1/2)\n",
      "Rendering (2/2)                                                    \n",
      "Done                                                               \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/column_output_2.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "output_df = df.cols.apply(input_cols=\"names\", output_cols=\"names_up\",func=func)\n",
    "output_df.table_image(\"images/column_output_2.png\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If a **list** is passed to **input_cols** and a **string** is passed to **out_cols** Optimus will concatenate the list with every element in the list to create a new column name with the output"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'names' with function func\n",
      "INFO:optimus:Using 'column_exp' to process column 'function' with function func\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading page (1/2)\n",
      "Rendering (2/2)                                                    \n",
      "Done                                                               \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/column_output_3.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "output_df = df.cols.apply(input_cols=[\"names\",\"function\"], output_cols=\"_up\",func=func)\n",
    "output_df.table_image(\"images/column_output_3.png\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If a **list** is passed to **input_cols** and a **list** is passed in **out_cols** Optimus will output every input column in the respective output column"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'names' with function func\n",
      "INFO:optimus:Using 'column_exp' to process column 'function' with function func\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading page (1/2)\n",
      "Rendering (2/2)                                                    \n",
      "Done                                                               \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/column_output_4.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "output_df = df.cols.apply(input_cols=[\"names\",\"function\"], output_cols=[\"names_up\",\"function_up\"],func=func)\n",
    "output_df.table_image(\"images/column_output_4.png\")"
   ]
  },
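  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The four cases above can be restated as one small naming rule. The plain-Python sketch below only illustrates the behaviour described in this section; it is not Optimus's actual implementation, and the helper name `resolve_output_cols` is hypothetical.\n",
    "\n",
    "```python\n",
    "def resolve_output_cols(input_cols, output_cols=None):\n",
    "    # Hypothetical helper: returns (input, output) column-name pairs.\n",
    "    is_single = isinstance(input_cols, str)\n",
    "    inputs = [input_cols] if is_single else list(input_cols)\n",
    "    if output_cols is None:\n",
    "        # No output given: each result overwrites its input column.\n",
    "        outputs = inputs\n",
    "    elif isinstance(output_cols, str):\n",
    "        if is_single:\n",
    "            # String to string: the result goes to the named column.\n",
    "            outputs = [output_cols]\n",
    "        else:\n",
    "            # List to string: the string is appended to every input name.\n",
    "            outputs = [name + output_cols for name in inputs]\n",
    "    else:\n",
    "        # List to list: inputs and outputs are paired by position.\n",
    "        outputs = list(output_cols)\n",
    "        if len(outputs) != len(inputs):\n",
    "            raise ValueError('input_cols and output_cols must have the same length')\n",
    "    return list(zip(inputs, outputs))\n",
    "\n",
    "print(resolve_output_cols('names'))\n",
    "print(resolve_output_cols(['names', 'function'], '_up'))\n",
    "```\n",
    "\n",
    "Pairing each input with its output name up front makes it easy to see which columns a call overwrites and which ones it creates."
   ]
  },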
  {
   "cell_type": "markdown",
   "metadata": {
    "lines_to_next_cell": 0
   },
   "source": [
    "### Custom functions\n",
    "Spark offers several ways to transform your data, such as RDDs, Column Expressions, UDFs and pandas UDFs. Optimus provides `apply()` and `apply_expr()`, which handle the implementation complexity for you.\n",
    "\n",
    "Here you apply a function to the \"height(ft)\" column, adding 1 and 2 to the current column value. It is all powered by a pandas UDF."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'pandas_udf' to process column 'height(ft)' with function func\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading page (1/2)\n",
      "Rendering (2/2)                                                    \n",
      "Done                                                               \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/table3.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "def func(value, args):\n",
    "    return value + args[0] + args[1]\n",
    "\n",
    "df.cols.apply(\"height(ft)\",func,\"int\", [1,2]).table_image(\"images/table3.png\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you want to apply a Column Expression use `apply_expr()` like this. In this case we pass an argument 10 to divide the actual column value"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'height(ft)' with function func\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading page (1/2)\n",
      "Rendering (2/2)                                                    \n",
      "Done                                                               \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/table4.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from pyspark.sql import functions as F\n",
    "\n",
    "def func(col_name, args):\n",
    "    # Divide the column by the value passed through `args`\n",
    "    return F.col(col_name)/args\n",
    "\n",
    "df.cols.apply(\"height(ft)\", func=func, args=20).table_image(\"images/table4.png\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can change the table output back to ascii if you wish"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "op.output(\"ascii\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To return to HTML just:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "op.output(\"html\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data profiling\n",
    "\n",
    "Optimus comes with a powerful and unique data profiler. Besides basic and advance stats like min, max, kurtosis, mad etc, \n",
    "it also let you know what type of data has every column. For example if a string column have string, integer, float, bool, date Optimus can give you an unique overview about your data. \n",
    "Just run `df.profile(\"*\")` to profile all the columns. For more info about the profiler please go to this [notebook](./examples/profiler.ipynb).\n",
    "\n",
    "Let's load a \"big\" dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Downloading Meteorite_Landings.csv from https://raw.githubusercontent.com/ironmussa/Optimus/master/examples/data/Meteorite_Landings.csv\n",
      "INFO:optimus:Downloaded 4978151 bytes\n",
      "INFO:optimus:Creating DataFrame for Meteorite_Landings.csv. Please wait...\n",
      "INFO:optimus:Successfully created DataFrame for 'Meteorite_Landings.csv'\n"
     ]
    }
   ],
   "source": [
    "df = op.load.csv(\"https://raw.githubusercontent.com/ironmussa/Optimus/master/examples/data/Meteorite_Landings.csv\").ext.h_repartition()"
   ]
  },
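  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the per-column data type counts more concrete, here is a minimal plain-Python sketch of the idea: classify every raw value and count the results. The helpers `infer_type` and `count_value_types`, the classification rules and the `%Y/%m/%d` date format are all assumptions made for illustration; the detection logic inside Optimus may differ.\n",
    "\n",
    "```python\n",
    "from collections import Counter\n",
    "from datetime import datetime\n",
    "\n",
    "def infer_type(value):\n",
    "    # Classify one raw string value (illustrative rules only).\n",
    "    if value is None or value == '':\n",
    "        return 'null'\n",
    "    if value.lower() in ('true', 'false'):\n",
    "        return 'bool'\n",
    "    try:\n",
    "        int(value)\n",
    "        return 'int'\n",
    "    except ValueError:\n",
    "        pass\n",
    "    try:\n",
    "        float(value)\n",
    "        return 'float'\n",
    "    except ValueError:\n",
    "        pass\n",
    "    try:\n",
    "        datetime.strptime(value, '%Y/%m/%d')\n",
    "        return 'date'\n",
    "    except ValueError:\n",
    "        return 'string'\n",
    "\n",
    "def count_value_types(values):\n",
    "    # Count how many values fall into each inferred type.\n",
    "    return Counter(infer_type(v) for v in values)\n",
    "\n",
    "sample = ['12', '3.5', 'true', '2016/09/10', 'Optimus', None]\n",
    "print(count_value_types(sample))\n",
    "```\n",
    "\n",
    "A column declared as string that mostly contains integers or dates shows up immediately in such a count, which is the kind of overview the profiler output below provides."
   ]
  },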
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Numeric"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Processing column 'mass (g)'...\n",
      "INFO:optimus:_count_data_types() executed in 2.31 sec\n",
      "INFO:optimus:count_data_types() executed in 2.31 sec\n",
      "INFO:optimus:cast_columns() executed in 0.0 sec\n",
      "INFO:optimus:agg_exprs() executed in 3.29 sec\n",
      "INFO:optimus:general_stats() executed in 3.3 sec\n",
      "INFO:optimus:------------------------------\n",
      "INFO:optimus:Processing column 'mass (g)'...\n",
      "INFO:optimus:frequency() executed in 4.77 sec\n",
      "INFO:optimus:stats_by_column() executed in 0.0 sec\n",
      "INFO:optimus:Using 'column_exp' to process column 'mass (g)' with function _cast_to\n",
      "INFO:optimus:percentile() executed in 0.39 sec\n",
      "INFO:optimus:Using 'column_exp' to process column 'mass (g)' with function _cast_to\n",
      "INFO:optimus:Using 'column_exp' to process column 'mass (g)' with function _cast_to\n",
      "INFO:optimus:extra_numeric_stats() executed in 0.9 sec\n",
      "INFO:optimus:Using 'column_exp' to process column 'mass (g)' with function _bucketizer\n",
      "INFO:optimus:bucketizer() executed in 0.47 sec\n",
      "INFO:optimus:hist() executed in 2.54 sec\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Including 'nan' as Null in processing 'name'\n",
      "Including 'nan' as Null in processing 'nametype'\n",
      "Including 'nan' as Null in processing 'recclass'\n",
      "Including 'nan' as Null in processing 'fall'\n",
      "Including 'nan' as Null in processing 'year'\n",
      "Including 'nan' as Null in processing 'GeoLocation'\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:dataset_info() executed in 2.28 sec\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "<div class=\"title_profiler\">\n",
       "    <h1>Overview</h1>\n",
       "</div>\n",
       "<div class=\"main\">\n",
       "    <div class=\"panel_profiler\">\n",
       "        <h2>Dataset info</h2>\n",
       "        <table>\n",
       "            <tbody>\n",
       "            <tr>\n",
       "                <td>Number of columns</td>\n",
       "                <td>10</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Number of rows</td>\n",
       "                <td>45716</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Total Missing (%)</td>\n",
       "                <td>0.49%</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Total size in memory</td>\n",
       "                <td>88.2 MB</td>\n",
       "\n",
       "            </tr>\n",
       "            </tbody>\n",
       "        </table>\n",
       "    </div>\n",
       "    <div class=\"panel_profiler\">\n",
       "        <h2>Column types</h2>\n",
       "        <table>\n",
       "            <tbody>\n",
       "            <tr>\n",
       "                <td>String</td>\n",
       "                <td>0</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Numeric</td>\n",
       "                <td>1</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Date</td>\n",
       "                <td>0</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Bool</td>\n",
       "                <td>0</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Array</td>\n",
       "                <td>0</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Not available</td>\n",
       "                <td>0</td>\n",
       "\n",
       "            </tr>\n",
       "            </tbody>\n",
       "        </table>\n",
       "    </div>\n",
       "</div><link rel=\"stylesheet\" type=\"text/css\" href=\"optimus/styles/styles.css\">\n",
       "\n",
       "\n",
       "<div class=\"main\">\n",
       "    <div class=\"info\">\n",
       "\n",
       "        \n",
       "\n",
       "        <div class=\"panel_profiler\">\n",
       "            <div>\n",
       "                <h2>mass (g)</h2>\n",
       "                <span>numeric</span>\n",
       "            </div>\n",
       "            <table>\n",
       "                <tbody>\n",
       "                <tr>\n",
       "                    <td>Unique</td>\n",
       "                    <td> 12497</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Unique (%)</td>\n",
       "                    <td> 27.336</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Missing</td>\n",
       "                    <td>0.0</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Missing (%)</td>\n",
       "                    <td>0</td>\n",
       "                </tr>\n",
       "                </tbody>\n",
       "            </table>\n",
       "            <div>\n",
       "                <h3>\n",
       "                    Datatypes\n",
       "                </h3>\n",
       "            </div>\n",
       "            <table>\n",
       "                <tbody>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        String\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Integer\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Float\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Bool\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Date\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Missing\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Null\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        131\n",
       "                    </td>\n",
       "\n",
       "                </tr>\n",
       "                </tbody>\n",
       "            </table>\n",
       "            \n",
       "            <div>\n",
       "                <h3>\n",
       "                    Basic Stats\n",
       "                </h3>\n",
       "\n",
       "            </div>\n",
       "            <table>\n",
       "                <tbody>\n",
       "                <tr>\n",
       "                    <td>Mean</td>\n",
       "                    <td>13278.07855</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Minimum</td>\n",
       "                    <td>0.0</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Maximum</td>\n",
       "                    <td>60000000.0</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Zeros(%)</td>\n",
       "                    <td></td>\n",
       "                </tr>\n",
       "\n",
       "                </tbody>\n",
       "            </table>\n",
       "            \n",
       "\n",
       "        </div>\n",
       "        <div class=\"panel_profiler\">\n",
       "            <h3>Frequency</h3>\n",
       "            <table>\n",
       "\n",
       "                <tr>\n",
       "                    <th>Value</th>\n",
       "                    <th>Count</th>\n",
       "                    <th>Frequency (%)</th>\n",
       "                </tr>\n",
       "                \n",
       "                <tr>\n",
       "                    <td>1.3</td>\n",
       "                    <td>171</td>\n",
       "                    <td>0.374%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>1.2</td>\n",
       "                    <td>140</td>\n",
       "                    <td>0.306%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>1.4</td>\n",
       "                    <td>138</td>\n",
       "                    <td>0.302%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>None</td>\n",
       "                    <td>131</td>\n",
       "                    <td>0.287%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>2.1</td>\n",
       "                    <td>130</td>\n",
       "                    <td>0.284%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>2.4</td>\n",
       "                    <td>126</td>\n",
       "                    <td>0.276%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>1.6</td>\n",
       "                    <td>120</td>\n",
       "                    <td>0.262%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>0.5</td>\n",
       "                    <td>119</td>\n",
       "                    <td>0.26%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>1.1</td>\n",
       "                    <td>116</td>\n",
       "                    <td>0.254%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>3.8</td>\n",
       "                    <td>114</td>\n",
       "                    <td>0.249%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>\"Missing\"</td>\n",
       "                    <td>0</td>\n",
       "                    <td>0.0%</td>\n",
       "                </tr>\n",
       "\n",
       "            </table>\n",
       "        </div>\n",
       "        \n",
       "\n",
       "        \n",
       "        <div class=\"panel_profiler\">\n",
       "\n",
       "\n",
       "            <h3>Quantile statistics</h3>\n",
       "            <table>\n",
       "                <tbody>\n",
       "                <tr>\n",
       "                    <td>Minimum</td>\n",
       "                    <td>0.0</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>5-th percentile</td>\n",
       "                    <td></td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Q1</td>\n",
       "                    <td></td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Median</td>\n",
       "                    <td></td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Q3</td>\n",
       "                    <td></td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>95-th percentile</td>\n",
       "                    <td></td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Maximum</td>\n",
       "                    <td>60000000.0</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Range</td>\n",
       "                    <td>60000000.0</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Interquartile range</td>\n",
       "                    <td>0.0</td>\n",
       "                </tr>\n",
       "                </tbody>\n",
       "            </table>\n",
       "        </div>\n",
       "        <div class=\"panel_profiler\">\n",
       "            <h3>Descriptive statistics</h3>\n",
       "            <table>\n",
       "                <tbody>\n",
       "                <tr>\n",
       "                    <td>Standard deviation</td>\n",
       "                    <td>574988.87641</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Coef of variation</td>\n",
       "                    <td>43.30362</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Kurtosis</td>\n",
       "                    <td>6796.17061</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Mean</td>\n",
       "                    <td>13278.07855</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>MAD</td>\n",
       "                    <td>0.0</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Skewness</td>\n",
       "                    <td></td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Sum</td>\n",
       "                    <td>605281210.638</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Variance</td>\n",
       "                    <td>330612207995.7785</td>\n",
       "                </tr>\n",
       "                </tbody>\n",
       "            </table>\n",
       "        </div>\n",
       "        \n",
       "    </div>\n",
       "    <table>\n",
       "        \n",
       "        <tr>\n",
       "\n",
       "            <td>\n",
       "\n",
       "                <div>\n",
       "                    <img src=\"\">\n",
       "                </div>\n",
       "            </td>\n",
       "        </tr>\n",
       "        \n",
       "        \n",
       "        <tr>\n",
       "            <td>\n",
       "                <div>\n",
       "                    <img src=\"\">\n",
       "                </div>\n",
       "\n",
       "            </td>\n",
       "\n",
       "        </tr>\n",
       "        \n",
       "        \n",
       "        \n",
       "        \n",
       "        \n",
       "        \n",
       "\n",
       "    </table>\n",
       "</div>\n",
       "\n",
       "\n",
       "\n",
       "\n",
       "<div class=\"info_items\">Viewing 10 of 45716 rows / 10 columns</div>\n",
       "<div class=\"info_items\">32 partition(s)</div>\n",
       "\n",
       "<table class=\"optimus_table\">\n",
       "    <thead>\n",
       "    <tr>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">name</div>\n",
       "            <div class=\"data_type\">1 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">id</div>\n",
       "            <div class=\"data_type\">2 (int)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">nametype</div>\n",
       "            <div class=\"data_type\">3 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">recclass</div>\n",
       "            <div class=\"data_type\">4 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">mass (g)</div>\n",
       "            <div class=\"data_type\">5 (double)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">fall</div>\n",
       "            <div class=\"data_type\">6 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">year</div>\n",
       "            <div class=\"data_type\">7 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">reclat</div>\n",
       "            <div class=\"data_type\">8 (double)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">reclong</div>\n",
       "            <div class=\"data_type\">9 (double)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">GeoLocation</div>\n",
       "            <div class=\"data_type\">10 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "    </tr>\n",
       "\n",
       "    </thead>\n",
       "    <tbody>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Acfer&#8901;232'>Acfer&#8901;232\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='240'>240\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='H5'>H5\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='725.0'>725.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1991&#8901;12:00:00&#8901;AM'>01/01/1991&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='27.73944'>27.73944\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='4.32833'>4.32833\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(27.739440,&#8901;4.328330)'>(27.739440,&#8901;4.328330)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Asuka&#8901;87197'>Asuka&#8901;87197\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='2554'>2554\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='H4'>H4\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='124.99'>124.99\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1987&#8901;12:00:00&#8901;AM'>01/01/1987&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-72.0'>-72.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='26.0'>26.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(-72.000000,&#8901;26.000000)'>(-72.000000,&#8901;26.000000)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Gladstone&#8901;(iron)'>Gladstone&#8901;(iron)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='10920'>10920\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Iron,&#8901;IAB-MG'>Iron,&#8901;IAB-MG\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='736600.0'>736600.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1915&#8901;12:00:00&#8901;AM'>01/01/1915&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-23.9'>-23.9\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='151.3'>151.3\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(-23.900000,&#8901;151.300000)'>(-23.900000,&#8901;151.300000)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Nullarbor&#8901;015'>Nullarbor&#8901;015\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='17955'>17955\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='L6'>L6\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='3986.0'>3986.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1980&#8901;12:00:00&#8901;AM'>01/01/1980&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Lewis&#8901;Cliff&#8901;86533'>Lewis&#8901;Cliff&#8901;86533\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='13461'>13461\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='H5'>H5\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='15.7'>15.7\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1986&#8901;12:00:00&#8901;AM'>01/01/1986&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-84.26756'>-84.26756\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='161.3631'>161.3631\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(-84.267560,&#8901;161.363100)'>(-84.267560,&#8901;161.363100)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Grove&#8901;Mountains&#8901;053589'>Grove&#8901;Mountains&#8901;053589\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='48447'>48447\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='L5'>L5\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='1.4'>1.4\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/2006&#8901;12:00:00&#8901;AM'>01/01/2006&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-72.7825'>-72.7825\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='75.300278'>75.300278\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(-72.782500,&#8901;75.300278)'>(-72.782500,&#8901;75.300278)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Sayh&#8901;al&#8901;Uhaymir&#8901;108'>Sayh&#8901;al&#8901;Uhaymir&#8901;108\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='23300'>23300\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='H6'>H6\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='16.0'>16.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/2001&#8901;12:00:00&#8901;AM'>01/01/2001&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='21.06667'>21.06667\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='57.31667'>57.31667\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(21.066670,&#8901;57.316670)'>(21.066670,&#8901;57.316670)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Northwest&#8901;Africa&#8901;3088'>Northwest&#8901;Africa&#8901;3088\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='31218'>31218\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='L6'>L6\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='171.0'>171.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/2003&#8901;12:00:00&#8901;AM'>01/01/2003&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Reckling&#8901;Peak&#8901;92423'>Reckling&#8901;Peak&#8901;92423\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='22432'>22432\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='L6'>L6\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='3.8'>3.8\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1992&#8901;12:00:00&#8901;AM'>01/01/1992&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-76.22029'>-76.22029\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='158.37967'>158.37967\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(-76.220290,&#8901;158.379670)'>(-76.220290,&#8901;158.379670)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Sweetwater'>Sweetwater\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='23770'>23770\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='H5'>H5\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='1760.0'>1760.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1961&#8901;12:00:00&#8901;AM'>01/01/1961&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='32.55'>32.55\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-100.41667'>-100.41667\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(32.550000,&#8901;-100.416670)'>(32.550000,&#8901;-100.416670)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    </tbody>\n",
       "</table>\n",
       "\n",
       "\n",
       "<div class=\"info_items\">Viewing 10 of 45716 rows / 10 columns</div>\n",
       "<div class=\"info_items\">32 partition(s)</div>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:run() executed in 19.68 sec\n"
     ]
    }
   ],
   "source": [
    "# Profile only the \"mass (g)\" column of the DataFrame\n",
    "op.profiler.run(df, \"mass (g)\", infer=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading page (1/2)\n",
      "Warning: Failed to load file:///C:/Users/ARGENI~1/AppData/Local/Temp/optimus/styles/styles.css (ignore)\n",
      "Rendering (2/2)                                                    \n",
      "Done                                                               \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/profiler_numeric.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Save the profiler report as a PNG image\n",
    "op.profiler.to_image(output_path=\"images/profiler_numeric.png\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Processing column 'name'...\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Including 'nan' as Null in processing 'name'\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:_count_data_types() executed in 1.43 sec\n",
      "INFO:optimus:count_data_types() executed in 1.43 sec\n",
      "INFO:optimus:Using 'column_exp' to process column 'name' with function _cast_to\n",
      "INFO:optimus:cast_columns() executed in 0.02 sec\n",
      "INFO:optimus:agg_exprs() executed in 1.94 sec\n",
      "INFO:optimus:general_stats() executed in 1.95 sec\n",
      "INFO:optimus:------------------------------\n",
      "INFO:optimus:Processing column 'name'...\n",
      "INFO:optimus:frequency() executed in 4.21 sec\n",
      "INFO:optimus:stats_by_column() executed in 0.0 sec\n",
      "INFO:optimus:Using 'column_exp' to process column 'name_len' with function func_col_exp\n",
      "INFO:optimus:Using 'column_exp' to process column 'name_len' with function _bucketizer\n",
      "INFO:optimus:bucketizer() executed in 0.35 sec\n",
      "INFO:optimus:hist() executed in 3.02 sec\n",
      "INFO:optimus:hist_string() executed in 5.39 sec\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Including 'nan' as Null in processing 'name'\n",
      "Including 'nan' as Null in processing 'nametype'\n",
      "Including 'nan' as Null in processing 'recclass'\n",
      "Including 'nan' as Null in processing 'fall'\n",
      "Including 'nan' as Null in processing 'year'\n",
      "Including 'nan' as Null in processing 'GeoLocation'\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:dataset_info() executed in 1.78 sec\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "<div class=\"title_profiler\">\n",
       "    <h1>Overview</h1>\n",
       "</div>\n",
       "<div class=\"main\">\n",
       "    <div class=\"panel_profiler\">\n",
       "        <h2>Dataset info</h2>\n",
       "        <table>\n",
       "            <tbody>\n",
       "            <tr>\n",
       "                <td>Number of columns</td>\n",
       "                <td>10</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Number of rows</td>\n",
       "                <td>45716</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Total Missing (%)</td>\n",
       "                <td>0.49%</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Total size in memory</td>\n",
       "                <td>92.1 MB</td>\n",
       "\n",
       "            </tr>\n",
       "            </tbody>\n",
       "        </table>\n",
       "    </div>\n",
       "    <div class=\"panel_profiler\">\n",
       "        <h2>Column types</h2>\n",
       "        <table>\n",
       "            <tbody>\n",
       "            <tr>\n",
       "                <td>String</td>\n",
       "                <td>1</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Numeric</td>\n",
       "                <td>0</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Date</td>\n",
       "                <td>0</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Bool</td>\n",
       "                <td>0</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Array</td>\n",
       "                <td>0</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Not available</td>\n",
       "                <td>0</td>\n",
       "\n",
       "            </tr>\n",
       "            </tbody>\n",
       "        </table>\n",
       "    </div>\n",
       "</div><link rel=\"stylesheet\" type=\"text/css\" href=\"optimus/styles/styles.css\">\n",
       "\n",
       "\n",
       "<div class=\"main\">\n",
       "    <div class=\"info\">\n",
       "\n",
       "        \n",
       "\n",
       "        <div class=\"panel_profiler\">\n",
       "            <div>\n",
       "                <h2>name</h2>\n",
       "                <span>categorical</span>\n",
       "            </div>\n",
       "            <table>\n",
       "                <tbody>\n",
       "                <tr>\n",
       "                    <td>Unique</td>\n",
       "                    <td> 45515</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Unique (%)</td>\n",
       "                    <td> 99.56</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Missing</td>\n",
       "                    <td>0.0</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Missing (%)</td>\n",
       "                    <td>0</td>\n",
       "                </tr>\n",
       "                </tbody>\n",
       "            </table>\n",
       "            <div>\n",
       "                <h3>\n",
       "                    Datatypes\n",
       "                </h3>\n",
       "            </div>\n",
       "            <table>\n",
       "                <tbody>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        String\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        45716\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Integer\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Float\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Bool\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Date\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Missing\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Null\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "\n",
       "                </tr>\n",
       "                </tbody>\n",
       "            </table>\n",
       "            \n",
       "\n",
       "        </div>\n",
       "        <div class=\"panel_profiler\">\n",
       "            <h3>Frequency</h3>\n",
       "            <table>\n",
       "\n",
       "                <tr>\n",
       "                    <th>Value</th>\n",
       "                    <th>Count</th>\n",
       "                    <th>Frequency (%)</th>\n",
       "                </tr>\n",
       "                \n",
       "                <tr>\n",
       "                    <td>Święcany</td>\n",
       "                    <td>1</td>\n",
       "                    <td>0.002%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>Łowicz</td>\n",
       "                    <td>1</td>\n",
       "                    <td>0.002%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>Österplana 064</td>\n",
       "                    <td>1</td>\n",
       "                    <td>0.002%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>Österplana 063</td>\n",
       "                    <td>1</td>\n",
       "                    <td>0.002%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>Österplana 062</td>\n",
       "                    <td>1</td>\n",
       "                    <td>0.002%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>Österplana 061</td>\n",
       "                    <td>1</td>\n",
       "                    <td>0.002%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>Österplana 060</td>\n",
       "                    <td>1</td>\n",
       "                    <td>0.002%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>Österplana 059</td>\n",
       "                    <td>1</td>\n",
       "                    <td>0.002%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>Österplana 058</td>\n",
       "                    <td>1</td>\n",
       "                    <td>0.002%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>Österplana 057</td>\n",
       "                    <td>1</td>\n",
       "                    <td>0.002%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>\"Missing\"</td>\n",
       "                    <td>0</td>\n",
       "                    <td>0.0%</td>\n",
       "                </tr>\n",
       "\n",
       "            </table>\n",
       "        </div>\n",
       "        \n",
       "\n",
       "        \n",
       "    </div>\n",
       "    <table>\n",
       "        \n",
       "        <tr>\n",
       "\n",
       "            <td>\n",
       "\n",
       "                <div>\n",
       "                    <img src=\"\">\n",
       "                </div>\n",
       "            </td>\n",
       "        </tr>\n",
       "        \n",
       "        \n",
       "        <tr>\n",
       "            <td>\n",
       "                <div>\n",
       "                    <img src=\"\">\n",
       "                </div>\n",
       "\n",
       "            </td>\n",
       "\n",
       "        </tr>\n",
       "        \n",
       "        \n",
       "        \n",
       "        \n",
       "        \n",
       "        \n",
       "\n",
       "    </table>\n",
       "</div>\n",
       "\n",
       "\n",
       "\n",
       "\n",
       "<div class=\"info_items\">Viewing 10 of 45716 rows / 10 columns</div>\n",
       "<div class=\"info_items\">32 partition(s)</div>\n",
       "\n",
       "<table class=\"optimus_table\">\n",
       "    <thead>\n",
       "    <tr>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">name</div>\n",
       "            <div class=\"data_type\">1 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">id</div>\n",
       "            <div class=\"data_type\">2 (int)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">nametype</div>\n",
       "            <div class=\"data_type\">3 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">recclass</div>\n",
       "            <div class=\"data_type\">4 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">mass (g)</div>\n",
       "            <div class=\"data_type\">5 (double)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">fall</div>\n",
       "            <div class=\"data_type\">6 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">year</div>\n",
       "            <div class=\"data_type\">7 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">reclat</div>\n",
       "            <div class=\"data_type\">8 (double)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">reclong</div>\n",
       "            <div class=\"data_type\">9 (double)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">GeoLocation</div>\n",
       "            <div class=\"data_type\">10 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "    </tr>\n",
       "\n",
       "    </thead>\n",
       "    <tbody>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Acfer&#8901;232'>Acfer&#8901;232\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='240'>240\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='H5'>H5\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='725.0'>725.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1991&#8901;12:00:00&#8901;AM'>01/01/1991&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='27.73944'>27.73944\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='4.32833'>4.32833\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(27.739440,&#8901;4.328330)'>(27.739440,&#8901;4.328330)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Asuka&#8901;87197'>Asuka&#8901;87197\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='2554'>2554\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='H4'>H4\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='124.99'>124.99\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1987&#8901;12:00:00&#8901;AM'>01/01/1987&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-72.0'>-72.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='26.0'>26.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(-72.000000,&#8901;26.000000)'>(-72.000000,&#8901;26.000000)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Gladstone&#8901;(iron)'>Gladstone&#8901;(iron)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='10920'>10920\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Iron,&#8901;IAB-MG'>Iron,&#8901;IAB-MG\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='736600.0'>736600.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1915&#8901;12:00:00&#8901;AM'>01/01/1915&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-23.9'>-23.9\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='151.3'>151.3\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(-23.900000,&#8901;151.300000)'>(-23.900000,&#8901;151.300000)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Nullarbor&#8901;015'>Nullarbor&#8901;015\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='17955'>17955\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='L6'>L6\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='3986.0'>3986.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1980&#8901;12:00:00&#8901;AM'>01/01/1980&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Lewis&#8901;Cliff&#8901;86533'>Lewis&#8901;Cliff&#8901;86533\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='13461'>13461\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='H5'>H5\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='15.7'>15.7\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1986&#8901;12:00:00&#8901;AM'>01/01/1986&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-84.26756'>-84.26756\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='161.3631'>161.3631\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(-84.267560,&#8901;161.363100)'>(-84.267560,&#8901;161.363100)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Grove&#8901;Mountains&#8901;053589'>Grove&#8901;Mountains&#8901;053589\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='48447'>48447\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='L5'>L5\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='1.4'>1.4\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/2006&#8901;12:00:00&#8901;AM'>01/01/2006&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-72.7825'>-72.7825\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='75.300278'>75.300278\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(-72.782500,&#8901;75.300278)'>(-72.782500,&#8901;75.300278)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Sayh&#8901;al&#8901;Uhaymir&#8901;108'>Sayh&#8901;al&#8901;Uhaymir&#8901;108\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='23300'>23300\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='H6'>H6\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='16.0'>16.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/2001&#8901;12:00:00&#8901;AM'>01/01/2001&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='21.06667'>21.06667\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='57.31667'>57.31667\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(21.066670,&#8901;57.316670)'>(21.066670,&#8901;57.316670)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Northwest&#8901;Africa&#8901;3088'>Northwest&#8901;Africa&#8901;3088\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='31218'>31218\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='L6'>L6\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='171.0'>171.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/2003&#8901;12:00:00&#8901;AM'>01/01/2003&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Reckling&#8901;Peak&#8901;92423'>Reckling&#8901;Peak&#8901;92423\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='22432'>22432\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='L6'>L6\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='3.8'>3.8\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1992&#8901;12:00:00&#8901;AM'>01/01/1992&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-76.22029'>-76.22029\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='158.37967'>158.37967\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(-76.220290,&#8901;158.379670)'>(-76.220290,&#8901;158.379670)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Sweetwater'>Sweetwater\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='23770'>23770\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='H5'>H5\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='1760.0'>1760.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1961&#8901;12:00:00&#8901;AM'>01/01/1961&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='32.55'>32.55\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-100.41667'>-100.41667\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(32.550000,&#8901;-100.416670)'>(32.550000,&#8901;-100.416670)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    </tbody>\n",
       "</table>\n",
       "\n",
       "\n",
       "<div class=\"info_items\">Viewing 10 of 45716 rows / 10 columns</div>\n",
       "<div class=\"info_items\">32 partition(s)</div>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:run() executed in 17.73 sec\n"
     ]
    }
   ],
   "source": [
    "op.profiler.run(df, \"name\", infer=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "metadata": {},
   "outputs": [
    {
     "ename": "TypeError",
     "evalue": "expected string or bytes-like object",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mTypeError\u001b[0m                                 Traceback (most recent call last)",
      "\u001b[1;32m<ipython-input-63-1304007aee00>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mop\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mprofiler\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mto_image\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0moutput_path\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;34m\"images/profiler.png\"\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[1;32m~\\Documents\\Optimus\\optimus\\profiler\\profiler.py\u001b[0m in \u001b[0;36mto_image\u001b[1;34m(self, output_path)\u001b[0m\n\u001b[0;32m    267\u001b[0m         \"\"\"\n\u001b[0;32m    268\u001b[0m         \u001b[0mcss\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mabsolute_path\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"/css/styles.css\"\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 269\u001b[1;33m         \u001b[0mimgkit\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfrom_string\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mhtml\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0moutput_path\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mcss\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mcss\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m    270\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    271\u001b[0m         \u001b[0mprint_html\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"<img src='\"\u001b[0m \u001b[1;33m+\u001b[0m \u001b[0moutput_path\u001b[0m \u001b[1;33m+\u001b[0m \u001b[1;34m\"'>\"\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\imgkit\\api.py\u001b[0m in \u001b[0;36mfrom_string\u001b[1;34m(string, output_path, options, toc, cover, css, config, cover_first)\u001b[0m\n\u001b[0;32m     87\u001b[0m     \"\"\"\n\u001b[0;32m     88\u001b[0m     rtn = IMGKit(string, 'string', options=options, toc=toc, cover=cover, css=css,\n\u001b[1;32m---> 89\u001b[1;33m                  config=config, cover_first=cover_first)\n\u001b[0m\u001b[0;32m     90\u001b[0m     \u001b[1;32mreturn\u001b[0m \u001b[0mrtn\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mto_img\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0moutput_path\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     91\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\imgkit\\imgkit.py\u001b[0m in \u001b[0;36m__init__\u001b[1;34m(self, url_or_file, source_type, options, toc, cover, css, config, cover_first)\u001b[0m\n\u001b[0;32m     42\u001b[0m         \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0moptions\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;33m{\u001b[0m\u001b[1;33m}\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     43\u001b[0m         \u001b[1;32mif\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msource\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0misString\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 44\u001b[1;33m             \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0moptions\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mupdate\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_find_options_in_meta\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0murl_or_file\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m     45\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     46\u001b[0m         \u001b[1;32mif\u001b[0m \u001b[0moptions\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\imgkit\\imgkit.py\u001b[0m in \u001b[0;36m_find_options_in_meta\u001b[1;34m(self, content)\u001b[0m\n\u001b[0;32m    199\u001b[0m         \u001b[0mfound\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;33m{\u001b[0m\u001b[1;33m}\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    200\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 201\u001b[1;33m         \u001b[1;32mfor\u001b[0m \u001b[0mx\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mre\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfindall\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'<meta [^>]*>'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mcontent\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m    202\u001b[0m             \u001b[1;32mif\u001b[0m \u001b[0mre\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msearch\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'name=[\"\\']%s'\u001b[0m \u001b[1;33m%\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mconfig\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mmeta_tag_prefix\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mx\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    203\u001b[0m                 name = re.findall('name=[\"\\']%s([^\"\\']*)' %\n",
      "\u001b[1;32m~\\Anaconda3\\lib\\re.py\u001b[0m in \u001b[0;36mfindall\u001b[1;34m(pattern, string, flags)\u001b[0m\n\u001b[0;32m    221\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    222\u001b[0m     Empty matches are included in the result.\"\"\"\n\u001b[1;32m--> 223\u001b[1;33m     \u001b[1;32mreturn\u001b[0m \u001b[0m_compile\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mpattern\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mflags\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfindall\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mstring\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m    224\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    225\u001b[0m \u001b[1;32mdef\u001b[0m \u001b[0mfinditer\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mpattern\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mstring\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mflags\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;31mTypeError\u001b[0m: expected string or bytes-like object"
     ]
    }
   ],
   "source": [
    "op.profiler.to_image(output_path=\"images/profiler.png\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Processing Dates"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "lines_to_next_cell": 0
   },
   "source": [
    "For columns with a date data type, Optimus can give you extra information."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Processing column 'year'...\n",
      "INFO:optimus:_count_data_types() executed in 23.75 sec\n",
      "INFO:optimus:count_data_types() executed in 23.75 sec\n",
      "INFO:optimus:cast_columns() executed in 0.0 sec\n",
      "INFO:optimus:agg_exprs() executed in 1.59 sec\n",
      "INFO:optimus:general_stats() executed in 1.6 sec\n",
      "INFO:optimus:------------------------------\n",
      "INFO:optimus:Processing column 'year'...\n",
      "INFO:optimus:frequency() executed in 3.26 sec\n",
      "INFO:optimus:stats_by_column() executed in 0.0 sec\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'year' with function func_infer_date\n",
      "INFO:optimus:Using 'column_exp' to process column 'year_0' with function _bucketizer\n",
      "INFO:optimus:bucketizer() executed in 0.81 sec\n",
      "INFO:optimus:hist() executed in 3.44 sec\n",
      "INFO:optimus:Using 'column_exp' to process column 'year_1' with function _bucketizer\n",
      "INFO:optimus:bucketizer() executed in 0.12 sec\n",
      "INFO:optimus:hist() executed in 1.61 sec\n",
      "INFO:optimus:Using 'column_exp' to process column 'year_2' with function _bucketizer\n",
      "INFO:optimus:bucketizer() executed in 0.1 sec\n",
      "INFO:optimus:hist() executed in 1.76 sec\n",
      "INFO:optimus:Using 'column_exp' to process column 'year_3' with function _bucketizer\n",
      "INFO:optimus:bucketizer() executed in 0.22 sec\n",
      "INFO:optimus:hist() executed in 1.7 sec\n",
      "INFO:optimus:Using 'column_exp' to process column 'year_4' with function _bucketizer\n",
      "INFO:optimus:bucketizer() executed in 0.52 sec\n",
      "INFO:optimus:hist() executed in 1.97 sec\n",
      "INFO:optimus:hist_date() executed in 62.45 sec\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Including 'nan' as Null in processing 'name'\n",
      "Including 'nan' as Null in processing 'nametype'\n",
      "Including 'nan' as Null in processing 'recclass'\n",
      "Including 'nan' as Null in processing 'fall'\n",
      "Including 'nan' as Null in processing 'year'\n",
      "Including 'nan' as Null in processing 'GeoLocation'\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:dataset_info() executed in 1.79 sec\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "<div class=\"title_profiler\">\n",
       "    <h1>Overview</h1>\n",
       "</div>\n",
       "<div class=\"main\">\n",
       "    <div class=\"panel_profiler\">\n",
       "        <h2>Dataset info</h2>\n",
       "        <table>\n",
       "            <tbody>\n",
       "            <tr>\n",
       "                <td>Number of columns</td>\n",
       "                <td>10</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Number of rows</td>\n",
       "                <td>45716</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Total Missing (%)</td>\n",
       "                <td>0.49%</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Total size in memory</td>\n",
       "                <td>97.6 MB</td>\n",
       "\n",
       "            </tr>\n",
       "            </tbody>\n",
       "        </table>\n",
       "    </div>\n",
       "    <div class=\"panel_profiler\">\n",
       "        <h2>Column types</h2>\n",
       "        <table>\n",
       "            <tbody>\n",
       "            <tr>\n",
       "                <td>String</td>\n",
       "                <td>0</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Numeric</td>\n",
       "                <td>0</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Date</td>\n",
       "                <td>1</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Bool</td>\n",
       "                <td>0</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Array</td>\n",
       "                <td>0</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Not available</td>\n",
       "                <td>0</td>\n",
       "\n",
       "            </tr>\n",
       "            </tbody>\n",
       "        </table>\n",
       "    </div>\n",
       "</div><link rel=\"stylesheet\" type=\"text/css\" href=\"optimus/styles/styles.css\">\n",
       "\n",
       "\n",
       "<div class=\"main\">\n",
       "    <div class=\"info\">\n",
       "\n",
       "        \n",
       "\n",
       "        <div class=\"panel_profiler\">\n",
       "            <div>\n",
       "                <h2>year</h2>\n",
       "                <span>date</span>\n",
       "            </div>\n",
       "            <table>\n",
       "                <tbody>\n",
       "                <tr>\n",
       "                    <td>Unique</td>\n",
       "                    <td> 265</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Unique (%)</td>\n",
       "                    <td> 0.58</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Missing</td>\n",
       "                    <td>0.0</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Missing (%)</td>\n",
       "                    <td>0</td>\n",
       "                </tr>\n",
       "                </tbody>\n",
       "            </table>\n",
       "            <div>\n",
       "                <h3>\n",
       "                    Datatypes\n",
       "                </h3>\n",
       "            </div>\n",
       "            <table>\n",
       "                <tbody>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        String\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Integer\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Float\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Bool\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Date\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        45428\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Missing\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Null\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        288\n",
       "                    </td>\n",
       "\n",
       "                </tr>\n",
       "                </tbody>\n",
       "            </table>\n",
       "            \n",
       "\n",
       "        </div>\n",
       "        <div class=\"panel_profiler\">\n",
       "            <h3>Frequency</h3>\n",
       "            <table>\n",
       "\n",
       "                <tr>\n",
       "                    <th>Value</th>\n",
       "                    <th>Count</th>\n",
       "                    <th>Frequency (%)</th>\n",
       "                </tr>\n",
       "                \n",
       "                <tr>\n",
       "                    <td>01/01/2003 12:00:00 AM</td>\n",
       "                    <td>3323</td>\n",
       "                    <td>7.269%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>01/01/1979 12:00:00 AM</td>\n",
       "                    <td>3046</td>\n",
       "                    <td>6.663%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>01/01/1998 12:00:00 AM</td>\n",
       "                    <td>2697</td>\n",
       "                    <td>5.899%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>01/01/2006 12:00:00 AM</td>\n",
       "                    <td>2456</td>\n",
       "                    <td>5.372%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>01/01/1988 12:00:00 AM</td>\n",
       "                    <td>2296</td>\n",
       "                    <td>5.022%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>01/01/2002 12:00:00 AM</td>\n",
       "                    <td>2078</td>\n",
       "                    <td>4.545%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>01/01/2004 12:00:00 AM</td>\n",
       "                    <td>1940</td>\n",
       "                    <td>4.244%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>01/01/2000 12:00:00 AM</td>\n",
       "                    <td>1792</td>\n",
       "                    <td>3.92%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>01/01/1997 12:00:00 AM</td>\n",
       "                    <td>1696</td>\n",
       "                    <td>3.71%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>01/01/1999 12:00:00 AM</td>\n",
       "                    <td>1691</td>\n",
       "                    <td>3.699%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>\"Missing\"</td>\n",
       "                    <td>0</td>\n",
       "                    <td>0.0%</td>\n",
       "                </tr>\n",
       "\n",
       "            </table>\n",
       "        </div>\n",
       "        \n",
       "\n",
       "        \n",
       "    </div>\n",
       "    <table>\n",
       "        \n",
       "        \n",
       "        <tr>\n",
       "            <td>\n",
       "                <div>\n",
       "                    <img src=\"\">\n",
       "                </div>\n",
       "\n",
       "            </td>\n",
       "\n",
       "        </tr>\n",
       "        \n",
       "        \n",
       "        <tr>\n",
       "            <td>\n",
       "                <div>\n",
       "                    <img src=\"\">\n",
       "                </div>\n",
       "\n",
       "            </td>\n",
       "\n",
       "        </tr>\n",
       "        \n",
       "        \n",
       "        <tr>\n",
       "            <td>\n",
       "                <div>\n",
       "                    <img src=\"\">\n",
       "                </div>\n",
       "\n",
       "            </td>\n",
       "\n",
       "        </tr>\n",
       "        \n",
       "        \n",
       "        <tr>\n",
       "            <td>\n",
       "                <div>\n",
       "                    <img src=\"\">\n",
       "                </div>\n",
       "\n",
       "            </td>\n",
       "\n",
       "        </tr>\n",
       "        \n",
       "        \n",
       "        <tr>\n",
       "            <td>\n",
       "                <div>\n",
       "                    <img src=\"\">\n",
       "                </div>\n",
       "\n",
       "            </td>\n",
       "\n",
       "        </tr>\n",
       "        \n",
       "        \n",
       "        <tr>\n",
       "            <td>\n",
       "                <div>\n",
       "                    <img src=\"\">\n",
       "                </div>\n",
       "\n",
       "            </td>\n",
       "\n",
       "        </tr>\n",
       "        \n",
       "\n",
       "    </table>\n",
       "</div>\n",
       "\n",
       "\n",
       "\n",
       "\n",
       "<div class=\"info_items\">Viewing 10 of 45716 rows / 10 columns</div>\n",
       "<div class=\"info_items\">32 partition(s)</div>\n",
       "\n",
       "<table class=\"optimus_table\">\n",
       "    <thead>\n",
       "    <tr>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">name</div>\n",
       "            <div class=\"data_type\">1 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">id</div>\n",
       "            <div class=\"data_type\">2 (int)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">nametype</div>\n",
       "            <div class=\"data_type\">3 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">recclass</div>\n",
       "            <div class=\"data_type\">4 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">mass (g)</div>\n",
       "            <div class=\"data_type\">5 (double)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">fall</div>\n",
       "            <div class=\"data_type\">6 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">year</div>\n",
       "            <div class=\"data_type\">7 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">reclat</div>\n",
       "            <div class=\"data_type\">8 (double)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">reclong</div>\n",
       "            <div class=\"data_type\">9 (double)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">GeoLocation</div>\n",
       "            <div class=\"data_type\">10 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "    </tr>\n",
       "\n",
       "    </thead>\n",
       "    <tbody>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Acfer&#8901;232'>Acfer&#8901;232\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='240'>240\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='H5'>H5\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='725.0'>725.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1991&#8901;12:00:00&#8901;AM'>01/01/1991&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='27.73944'>27.73944\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='4.32833'>4.32833\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(27.739440,&#8901;4.328330)'>(27.739440,&#8901;4.328330)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Asuka&#8901;87197'>Asuka&#8901;87197\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='2554'>2554\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='H4'>H4\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='124.99'>124.99\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1987&#8901;12:00:00&#8901;AM'>01/01/1987&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-72.0'>-72.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='26.0'>26.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(-72.000000,&#8901;26.000000)'>(-72.000000,&#8901;26.000000)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Gladstone&#8901;(iron)'>Gladstone&#8901;(iron)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='10920'>10920\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Iron,&#8901;IAB-MG'>Iron,&#8901;IAB-MG\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='736600.0'>736600.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1915&#8901;12:00:00&#8901;AM'>01/01/1915&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-23.9'>-23.9\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='151.3'>151.3\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(-23.900000,&#8901;151.300000)'>(-23.900000,&#8901;151.300000)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Nullarbor&#8901;015'>Nullarbor&#8901;015\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='17955'>17955\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='L6'>L6\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='3986.0'>3986.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1980&#8901;12:00:00&#8901;AM'>01/01/1980&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Lewis&#8901;Cliff&#8901;86533'>Lewis&#8901;Cliff&#8901;86533\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='13461'>13461\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='H5'>H5\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='15.7'>15.7\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1986&#8901;12:00:00&#8901;AM'>01/01/1986&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-84.26756'>-84.26756\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='161.3631'>161.3631\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(-84.267560,&#8901;161.363100)'>(-84.267560,&#8901;161.363100)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Grove&#8901;Mountains&#8901;053589'>Grove&#8901;Mountains&#8901;053589\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='48447'>48447\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='L5'>L5\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='1.4'>1.4\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/2006&#8901;12:00:00&#8901;AM'>01/01/2006&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-72.7825'>-72.7825\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='75.300278'>75.300278\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(-72.782500,&#8901;75.300278)'>(-72.782500,&#8901;75.300278)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Sayh&#8901;al&#8901;Uhaymir&#8901;108'>Sayh&#8901;al&#8901;Uhaymir&#8901;108\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='23300'>23300\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='H6'>H6\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='16.0'>16.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/2001&#8901;12:00:00&#8901;AM'>01/01/2001&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='21.06667'>21.06667\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='57.31667'>57.31667\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(21.066670,&#8901;57.316670)'>(21.066670,&#8901;57.316670)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Northwest&#8901;Africa&#8901;3088'>Northwest&#8901;Africa&#8901;3088\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='31218'>31218\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='L6'>L6\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='171.0'>171.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/2003&#8901;12:00:00&#8901;AM'>01/01/2003&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Reckling&#8901;Peak&#8901;92423'>Reckling&#8901;Peak&#8901;92423\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='22432'>22432\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='L6'>L6\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='3.8'>3.8\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1992&#8901;12:00:00&#8901;AM'>01/01/1992&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-76.22029'>-76.22029\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='158.37967'>158.37967\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(-76.220290,&#8901;158.379670)'>(-76.220290,&#8901;158.379670)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Sweetwater'>Sweetwater\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='23770'>23770\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='H5'>H5\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='1760.0'>1760.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1961&#8901;12:00:00&#8901;AM'>01/01/1961&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='32.55'>32.55\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-100.41667'>-100.41667\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(32.550000,&#8901;-100.416670)'>(32.550000,&#8901;-100.416670)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    </tbody>\n",
       "</table>\n",
       "\n",
       "\n",
       "<div class=\"info_items\">Viewing 10 of 45716 rows / 10 columns</div>\n",
       "<div class=\"info_items\">32 partition(s)</div>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:run() executed in 96.23 sec\n"
     ]
    }
   ],
   "source": [
    "op.profiler.run(df, \"year\", infer=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading page (1/2)\n",
      "Warning: Failed to load file:///C:/Users/ARGENI~1/AppData/Local/Temp/optimus/styles/styles.css (ignore)\n",
      "Rendering (2/2)                                                    \n",
      "Done                                                               \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/profiler1.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "op.profiler.to_image(output_path=\"images/profiler1.png\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Profiler Speed"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With **relative_error** and **approx_count** params you can control how some operations are caculated so you can speedup the profiling in case is needed.\n",
    "\n",
    "relative_error: Relative Error for quantile discretizer calculation. 1 is Faster, 0 Slower\n",
    "\n",
    "approx_count: Use ```approx_count_distinct``` or ```countDistinct```. ```approx_count_distinct``` is faster"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Processing column 'mass (g)'...\n",
      "INFO:optimus:_count_data_types() executed in 1.33 sec\n",
      "INFO:optimus:count_data_types() executed in 1.33 sec\n",
      "INFO:optimus:cast_columns() executed in 0.0 sec\n",
      "INFO:optimus:agg_exprs() executed in 1.64 sec\n",
      "INFO:optimus:general_stats() executed in 1.65 sec\n",
      "INFO:optimus:------------------------------\n",
      "INFO:optimus:Processing column 'mass (g)'...\n",
      "INFO:optimus:frequency() executed in 3.56 sec\n",
      "INFO:optimus:stats_by_column() executed in 0.0 sec\n",
      "INFO:optimus:Using 'column_exp' to process column 'mass (g)' with function _cast_to\n",
      "INFO:optimus:percentile() executed in 0.23 sec\n",
      "INFO:optimus:Using 'column_exp' to process column 'mass (g)' with function _cast_to\n",
      "INFO:optimus:Using 'column_exp' to process column 'mass (g)' with function _cast_to\n",
      "INFO:optimus:extra_numeric_stats() executed in 0.58 sec\n",
      "INFO:optimus:Using 'column_exp' to process column 'mass (g)' with function _bucketizer\n",
      "INFO:optimus:bucketizer() executed in 0.3 sec\n",
      "INFO:optimus:hist() executed in 2.04 sec\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Including 'nan' as Null in processing 'name'\n",
      "Including 'nan' as Null in processing 'nametype'\n",
      "Including 'nan' as Null in processing 'recclass'\n",
      "Including 'nan' as Null in processing 'fall'\n",
      "Including 'nan' as Null in processing 'year'\n",
      "Including 'nan' as Null in processing 'GeoLocation'\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:dataset_info() executed in 1.75 sec\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "<div class=\"title_profiler\">\n",
       "    <h1>Overview</h1>\n",
       "</div>\n",
       "<div class=\"main\">\n",
       "    <div class=\"panel_profiler\">\n",
       "        <h2>Dataset info</h2>\n",
       "        <table>\n",
       "            <tbody>\n",
       "            <tr>\n",
       "                <td>Number of columns</td>\n",
       "                <td>10</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Number of rows</td>\n",
       "                <td>45716</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Total Missing (%)</td>\n",
       "                <td>0.49%</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Total size in memory</td>\n",
       "                <td>98.2 MB</td>\n",
       "\n",
       "            </tr>\n",
       "            </tbody>\n",
       "        </table>\n",
       "    </div>\n",
       "    <div class=\"panel_profiler\">\n",
       "        <h2>Column types</h2>\n",
       "        <table>\n",
       "            <tbody>\n",
       "            <tr>\n",
       "                <td>String</td>\n",
       "                <td>0</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Numeric</td>\n",
       "                <td>1</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Date</td>\n",
       "                <td>0</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Bool</td>\n",
       "                <td>0</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Array</td>\n",
       "                <td>0</td>\n",
       "\n",
       "            </tr>\n",
       "            <tr>\n",
       "                <td>Not available</td>\n",
       "                <td>0</td>\n",
       "\n",
       "            </tr>\n",
       "            </tbody>\n",
       "        </table>\n",
       "    </div>\n",
       "</div><link rel=\"stylesheet\" type=\"text/css\" href=\"optimus/styles/styles.css\">\n",
       "\n",
       "\n",
       "<div class=\"main\">\n",
       "    <div class=\"info\">\n",
       "\n",
       "        \n",
       "\n",
       "        <div class=\"panel_profiler\">\n",
       "            <div>\n",
       "                <h2>mass (g)</h2>\n",
       "                <span>numeric</span>\n",
       "            </div>\n",
       "            <table>\n",
       "                <tbody>\n",
       "                <tr>\n",
       "                    <td>Unique</td>\n",
       "                    <td> 12497</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Unique (%)</td>\n",
       "                    <td> 27.336</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Missing</td>\n",
       "                    <td>0.0</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Missing (%)</td>\n",
       "                    <td>0</td>\n",
       "                </tr>\n",
       "                </tbody>\n",
       "            </table>\n",
       "            <div>\n",
       "                <h3>\n",
       "                    Datatypes\n",
       "                </h3>\n",
       "            </div>\n",
       "            <table>\n",
       "                <tbody>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        String\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Integer\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Float\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Bool\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Date\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Missing\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        0\n",
       "                    </td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>\n",
       "                        Null\n",
       "                    </td>\n",
       "                    <td>\n",
       "                        131\n",
       "                    </td>\n",
       "\n",
       "                </tr>\n",
       "                </tbody>\n",
       "            </table>\n",
       "            \n",
       "            <div>\n",
       "                <h3>\n",
       "                    Basic Stats\n",
       "                </h3>\n",
       "\n",
       "            </div>\n",
       "            <table>\n",
       "                <tbody>\n",
       "                <tr>\n",
       "                    <td>Mean</td>\n",
       "                    <td>13278.07855</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Minimum</td>\n",
       "                    <td>0.0</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Maximum</td>\n",
       "                    <td>60000000.0</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Zeros(%)</td>\n",
       "                    <td></td>\n",
       "                </tr>\n",
       "\n",
       "                </tbody>\n",
       "            </table>\n",
       "            \n",
       "\n",
       "        </div>\n",
       "        <div class=\"panel_profiler\">\n",
       "            <h3>Frequency</h3>\n",
       "            <table>\n",
       "\n",
       "                <tr>\n",
       "                    <th>Value</th>\n",
       "                    <th>Count</th>\n",
       "                    <th>Frequency (%)</th>\n",
       "                </tr>\n",
       "                \n",
       "                <tr>\n",
       "                    <td>1.3</td>\n",
       "                    <td>171</td>\n",
       "                    <td>0.374%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>1.2</td>\n",
       "                    <td>140</td>\n",
       "                    <td>0.306%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>1.4</td>\n",
       "                    <td>138</td>\n",
       "                    <td>0.302%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>None</td>\n",
       "                    <td>131</td>\n",
       "                    <td>0.287%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>2.1</td>\n",
       "                    <td>130</td>\n",
       "                    <td>0.284%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>2.4</td>\n",
       "                    <td>126</td>\n",
       "                    <td>0.276%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>1.6</td>\n",
       "                    <td>120</td>\n",
       "                    <td>0.262%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>0.5</td>\n",
       "                    <td>119</td>\n",
       "                    <td>0.26%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>1.1</td>\n",
       "                    <td>116</td>\n",
       "                    <td>0.254%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>3.8</td>\n",
       "                    <td>114</td>\n",
       "                    <td>0.249%</td>\n",
       "                </tr>\n",
       "\n",
       "                \n",
       "                <tr>\n",
       "                    <td>\"Missing\"</td>\n",
       "                    <td>0</td>\n",
       "                    <td>0.0%</td>\n",
       "                </tr>\n",
       "\n",
       "            </table>\n",
       "        </div>\n",
       "        \n",
       "\n",
       "        \n",
       "        <div class=\"panel_profiler\">\n",
       "\n",
       "\n",
       "            <h3>Quantile statistics</h3>\n",
       "            <table>\n",
       "                <tbody>\n",
       "                <tr>\n",
       "                    <td>Minimum</td>\n",
       "                    <td>0.0</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>5-th percentile</td>\n",
       "                    <td></td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Q1</td>\n",
       "                    <td></td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Median</td>\n",
       "                    <td></td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Q3</td>\n",
       "                    <td></td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>95-th percentile</td>\n",
       "                    <td></td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Maximum</td>\n",
       "                    <td>60000000.0</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Range</td>\n",
       "                    <td>60000000.0</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Interquartile range</td>\n",
       "                    <td>0.0</td>\n",
       "                </tr>\n",
       "                </tbody>\n",
       "            </table>\n",
       "        </div>\n",
       "        <div class=\"panel_profiler\">\n",
       "            <h3>Descriptive statistics</h3>\n",
       "            <table>\n",
       "                <tbody>\n",
       "                <tr>\n",
       "                    <td>Standard deviation</td>\n",
       "                    <td>574988.87641</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Coef of variation</td>\n",
       "                    <td>43.30362</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Kurtosis</td>\n",
       "                    <td>6796.17061</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Mean</td>\n",
       "                    <td>13278.07855</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>MAD</td>\n",
       "                    <td>0.0</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Skewness</td>\n",
       "                    <td></td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Sum</td>\n",
       "                    <td>605281210.638</td>\n",
       "                </tr>\n",
       "                <tr>\n",
       "                    <td>Variance</td>\n",
       "                    <td>330612207995.7785</td>\n",
       "                </tr>\n",
       "                </tbody>\n",
       "            </table>\n",
       "        </div>\n",
       "        \n",
       "    </div>\n",
       "    <table>\n",
       "        \n",
       "        <tr>\n",
       "\n",
       "            <td>\n",
       "\n",
       "                <div>\n",
       "                    <img src=\"\">\n",
       "                </div>\n",
       "            </td>\n",
       "        </tr>\n",
       "        \n",
       "        \n",
       "        <tr>\n",
       "            <td>\n",
       "                <div>\n",
       "                    <img src=\"\">\n",
       "                </div>\n",
       "\n",
       "            </td>\n",
       "\n",
       "        </tr>\n",
       "        \n",
       "        \n",
       "        \n",
       "        \n",
       "        \n",
       "        \n",
       "\n",
       "    </table>\n",
       "</div>\n",
       "\n",
       "\n",
       "\n",
       "\n",
       "<div class=\"info_items\">Viewing 10 of 45716 rows / 10 columns</div>\n",
       "<div class=\"info_items\">32 partition(s)</div>\n",
       "\n",
       "<table class=\"optimus_table\">\n",
       "    <thead>\n",
       "    <tr>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">name</div>\n",
       "            <div class=\"data_type\">1 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">id</div>\n",
       "            <div class=\"data_type\">2 (int)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">nametype</div>\n",
       "            <div class=\"data_type\">3 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">recclass</div>\n",
       "            <div class=\"data_type\">4 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">mass (g)</div>\n",
       "            <div class=\"data_type\">5 (double)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">fall</div>\n",
       "            <div class=\"data_type\">6 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">year</div>\n",
       "            <div class=\"data_type\">7 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">reclat</div>\n",
       "            <div class=\"data_type\">8 (double)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">reclong</div>\n",
       "            <div class=\"data_type\">9 (double)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">GeoLocation</div>\n",
       "            <div class=\"data_type\">10 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "    </tr>\n",
       "\n",
       "    </thead>\n",
       "    <tbody>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Acfer&#8901;232'>Acfer&#8901;232\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='240'>240\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='H5'>H5\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='725.0'>725.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1991&#8901;12:00:00&#8901;AM'>01/01/1991&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='27.73944'>27.73944\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='4.32833'>4.32833\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(27.739440,&#8901;4.328330)'>(27.739440,&#8901;4.328330)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Asuka&#8901;87197'>Asuka&#8901;87197\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='2554'>2554\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='H4'>H4\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='124.99'>124.99\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1987&#8901;12:00:00&#8901;AM'>01/01/1987&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-72.0'>-72.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='26.0'>26.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(-72.000000,&#8901;26.000000)'>(-72.000000,&#8901;26.000000)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Gladstone&#8901;(iron)'>Gladstone&#8901;(iron)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='10920'>10920\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Iron,&#8901;IAB-MG'>Iron,&#8901;IAB-MG\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='736600.0'>736600.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1915&#8901;12:00:00&#8901;AM'>01/01/1915&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-23.9'>-23.9\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='151.3'>151.3\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(-23.900000,&#8901;151.300000)'>(-23.900000,&#8901;151.300000)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Nullarbor&#8901;015'>Nullarbor&#8901;015\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='17955'>17955\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='L6'>L6\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='3986.0'>3986.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1980&#8901;12:00:00&#8901;AM'>01/01/1980&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Lewis&#8901;Cliff&#8901;86533'>Lewis&#8901;Cliff&#8901;86533\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='13461'>13461\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='H5'>H5\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='15.7'>15.7\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1986&#8901;12:00:00&#8901;AM'>01/01/1986&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-84.26756'>-84.26756\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='161.3631'>161.3631\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(-84.267560,&#8901;161.363100)'>(-84.267560,&#8901;161.363100)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Grove&#8901;Mountains&#8901;053589'>Grove&#8901;Mountains&#8901;053589\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='48447'>48447\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='L5'>L5\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='1.4'>1.4\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/2006&#8901;12:00:00&#8901;AM'>01/01/2006&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-72.7825'>-72.7825\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='75.300278'>75.300278\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(-72.782500,&#8901;75.300278)'>(-72.782500,&#8901;75.300278)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Sayh&#8901;al&#8901;Uhaymir&#8901;108'>Sayh&#8901;al&#8901;Uhaymir&#8901;108\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='23300'>23300\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='H6'>H6\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='16.0'>16.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/2001&#8901;12:00:00&#8901;AM'>01/01/2001&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='21.06667'>21.06667\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='57.31667'>57.31667\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(21.066670,&#8901;57.316670)'>(21.066670,&#8901;57.316670)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Northwest&#8901;Africa&#8901;3088'>Northwest&#8901;Africa&#8901;3088\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='31218'>31218\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='L6'>L6\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='171.0'>171.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/2003&#8901;12:00:00&#8901;AM'>01/01/2003&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\"none \"\n",
       "                 title='None'>None\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Reckling&#8901;Peak&#8901;92423'>Reckling&#8901;Peak&#8901;92423\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='22432'>22432\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='L6'>L6\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='3.8'>3.8\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1992&#8901;12:00:00&#8901;AM'>01/01/1992&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-76.22029'>-76.22029\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='158.37967'>158.37967\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(-76.220290,&#8901;158.379670)'>(-76.220290,&#8901;158.379670)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Sweetwater'>Sweetwater\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='23770'>23770\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Valid'>Valid\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='H5'>H5\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='1760.0'>1760.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Found'>Found\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='01/01/1961&#8901;12:00:00&#8901;AM'>01/01/1961&#8901;12:00:00&#8901;AM\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='32.55'>32.55\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='-100.41667'>-100.41667\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='(32.550000,&#8901;-100.416670)'>(32.550000,&#8901;-100.416670)\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    </tbody>\n",
       "</table>\n",
       "\n",
       "\n",
       "<div class=\"info_items\">Viewing 10 of 45716 rows / 10 columns</div>\n",
       "<div class=\"info_items\">32 partition(s)</div>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:run() executed in 13.74 sec\n"
     ]
    }
   ],
   "source": [
    "op.profiler.run(df, \"mass (g)\", infer=False, relative_error=1, approx_count=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Plots\n",
    "Besides histograms and frequency plots, you also have scatter plots and box plots, all powered by Apache Spark through PySpark."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = op.load.excel(\"../examples/data/titanic3.xls\")\n",
    "df = df.rows.drop_na([\"age\",\"fare\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can render a plot in the notebook or save it as an image."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'fare' with function _bucketizer\n",
      "INFO:optimus:bucketizer() executed in 0.11 sec\n",
      "INFO:optimus:hist() executed in 5.16 sec\n",
      "INFO:optimus:hist() executed in 9.98 sec\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/hist.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Output as an image\n",
    "df.plot.hist(\"fare\", output_format=\"image\", output_path=\"images/hist.png\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<img src='images/frequency.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAA1EAAAEUCAYAAADKj7xcAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAHiBJREFUeJzt3X3creWc9/HPt3YPaNxKO3reoUyYYUgZQhiKSs0tphRbooZ7xvND42HwYshwT4hBmNGMGU0jg8bDrTHFcI+HQiNCMqmUnlMRqX7zx3FeLJe997WOva+rtXbr8369el1rrfNcax371zrXOr/ncZzHmapCkiRJkjSeDSbdAEmSJElanxiiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJGnGJTk/yQ1Jrh/5b5tJt2sSkiz5xROT7JXk9KV+H0nS0jFESZIA9q+qzUb+u3j+CkmWTaJhkiRNG0OUJGmVkqxIUkmOSHIB8O/D4w9K8v+TXJPkrCR7jTxnpySfTXJdklOTvD3JB4ZleyW5aN57nJ/kD4bbGyQ5Osl5Sa5MclKSLea1ZWWSC5JckeTlI6+zYZKXDc+9LsmZSbZP8o4k/3fee56S5Hlj/PsPT3LO8HrfT3LUvOUvSXJJkouTPGNo3z2GZZskefPQ1kuTvCvJ7fr+D0iSppUhSpK0kIcDuwJ7J9kW+DjwOmAL4EXAyUmWD+v+I3AmsCXwWmBlx/s8BzhweL9tgKuBd8xbZ0/gnsCjgD9Psuvw+AuAQ4DHAXcEng78FDgBOCTJBgBJthye+8FVNaCqMnL3MmC/4fUOB45Ncv/hdfYZ3vMPgHsMbR71RmAX4H7D8m2BPx/e4/Sq2muMekiSppQhSpIE8JGhZ+maJB+Zt+zVVfWTqroBOAz4RFV9oqpuqapTgTOAxyXZAXgg8Mqq+nlVfQ44paMNRwEvr6qLqurnwKuBg+YNI3xNVd1QVWcBZwH3HR5/BvCKqvpONWdV1ZVV9WXgx7TgBHAwcHpVXbpQY6rq41V13vB6nwU+DTx0WPwk4G+r6ptV9VPgNXPPSxLgmcDzq+qqqroOeP3w3pKk2wDHt0uSAA6sqn9bzbILR27vCDwxyf4jj20EnMbQe1RVPxlZ9gNg+zHbsCPwL0luGXnsZuAuI/d/NHL7p8Bmw+3tgfNW87on0MLfqcPft47TmCSPBV5F61HaALg98I1h8Ta08DhntEbLh3XPbHmqvRyw4TjvK0mafoYoSdJCRmesuxD4+6p65vyVkuwIbJ7kDiNBaoeR5/+EFi7m1t+QFjhGX/vpVfWFVbz2igXaeCFwd+DsVSz7AHB2kvvShiXO72n7DUk2AU4Gngp8tKp+MfTQzaWiS4DtRp4yGhSvAG4A7l1VP1zovSRJ6x+H80mSenwA2D/J3sNkDpsOE0ZsV1U/oPXOvCbJxkn2BEZ7rL4LbJpk3yQbAa8ANhlZ/i7gL4YwRpLlSQ4Ys13vBV6bZOc0v5vkzgBVdRHwFeDvgZOHYYkL2Xho2+XATUOv1GNGlp8EHJ5k1yS3ZzjfaXi/W4D30M6h2mr4t2ybZO8x/y2SpClniJIkja2qLgQOAF5GCxgXAi/mV78nTwb2AK6iDYX7u5Hn/hh4Ni3w/JDWMzU6W99bgY8Bn05yHfDF4bXG8Ve0YPNp4FrgfcDobHgnAL9DC1Lj/Duvo010cRJtgosnD22bW/5J4G20YYzfA/5zWPTz4e9Lh8e/mORa4N9oE2JIkm4DUrXk1xWUJM2oJK8G7lFVh024HQ+j9aKtGHqKFvv1d6UNJdykqm5a7NeXJE0Xe6IkSbdpw9DB5wLvXcwAleQPh2GLm9OmND/FACVJs8EQJUm6zRp6iK4BtgbessgvfxRtSON5tFkEn7XIry9JmlIO55MkSZKkDvZESZIkSVIHQ5QkSZIkdbhVL7a7
5ZZb1ooVK27Nt5QkSZKksZx55plXVNXyhda7VUPUihUrOOOMM27Nt5QkSZKksST5wTjrOZxPkiRJkjoYoiRJkiSpgyFKkiRJkjoYoiRJkiSpgyFKkiRJkjoYoiRJkiSpgyFKkiRJkjoYoiRJkiSpw616sd1ptOLoj0+6Cbeq84/Zd9JNkCRJktZr9kRJkiRJUgdDlCRJkiR1MERJkiRJUgdDlCRJkiR1MERJkiRJUgdDlCRJkiR1MERJkiRJUgdDlCRJkiR1MERJkiRJUgdDlCRJkiR1MERJkiRJUgdDlCRJkiR1MERJkiRJUgdDlCRJkiR1MERJkiRJUgdDlCRJkiR1MERJkiRJUodlk26A1h8rjv74pJtwqzr/mH0n3QRJkiRNIXuiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOox9sd0kGwJnAD+sqv2S7AScCGwBfBV4SlXduDTNlNYvs3ZhYvDixJIkaXb09EQ9Fzhn5P4bgWOramfgauCIxWyYJEmSJE2jsUJUku2AfYH3DvcDPBL40LDKCcCBS9FASZIkSZom4/ZEvQV4CXDLcP/OwDVVddNw/yJg20VumyRJkiRNnQXPiUqyH3BZVZ2ZZK+5h1exaq3m+UcCRwLssMMOa9lMSbdls3YOmeePSZK0fhunJ+ohwOOTnE+bSOKRtJ6pOyWZC2HbARev6slVdXxV7VZVuy1fvnwRmixJkiRJk7NgiKqqP6uq7apqBXAw8O9VdShwGnDQsNpK4KNL1kpJkiRJmhLrcp2olwIvSPI92jlS71ucJkmSJEnS9Br7OlEAVXU6cPpw+/vA7ovfJEnS6nj+mCRJk7cuPVGSJEmSNHMMUZIkSZLUwRAlSZIkSR0MUZIkSZLUwRAlSZIkSR0MUZIkSZLUwRAlSZIkSR0MUZIkSZLUwRAlSZIkSR0MUZIkSZLUwRAlSZIkSR0MUZIkSZLUwRAlSZIkSR0MUZIkSZLUwRAlSZIkSR0MUZIkSZLUwRAlSZIkSR0MUZIkSZLUwRAlSZIkSR0MUZIkSZLUwRAlSZIkSR2WTboBkiQthRVHf3zSTbjVnX/MvpNugiTNBHuiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOnixXUmSNHMXJ/bCxJLWhT1RkiRJktTBECVJkiRJHQxRkiRJktTBc6IkSZI6eP6YpAV7opJsmuTLSc5K8s0krxke3ynJl5Kcm+Sfkmy89M2VJEmSpMkaZzjfz4FHVtV9gfsB+yR5EPBG4Niq2hm4Gjhi6ZopSZIkSdNhwRBVzfXD3Y2G/wp4JPCh4fETgAOXpIWSJEmSNEXGmlgiyYZJvg5cBpwKnAdcU1U3DatcBGy7NE2UJEmSpOkx1sQSVXUzcL8kdwL+Bdh1Vaut6rlJjgSOBNhhhx3WspmSJEla38zaJBywbhNxzFq91udJS7qmOK+qa4DTgQcBd0oyF8K2Ay5ezXOOr6rdqmq35cuXr0tbJUmSJGnixpmdb/nQA0WS2wF/AJwDnAYcNKy2EvjoUjVSkiRJkqbFOMP5tgZOSLIhLXSdVFX/muRbwIlJXgd8DXjfErZTkiRJkqbCgiGqqv4L+L1VPP59YPelaJQkSZIkTauuc6IkSZIkadYZoiRJkiSpgyFKkiRJkjoYoiRJkiSpgyFKkiRJkjoYoiRJkiSpgyFKkiRJkjoYoiRJkiSpgyFKkiRJkjoYoiRJkiSpgyFKkiRJkjoYoiRJkiSpgyFKkiRJkjoYoiRJkiSpgyFKkiRJkjoYoiRJkiSpgyFKkiRJkjoYoiRJkiSpgyFKkiRJkjoYoiRJkiSpgyFKkiRJkjoYoiRJkiSpgyFKkiRJkjoYoiRJkiSpgyFKkiRJkjoYoiRJkiSpgyFKkiRJkjoYoiRJkiSpgyFKkiRJkjoY
oiRJkiSpgyFKkiRJkjoYoiRJkiSpgyFKkiRJkjoYoiRJkiSpgyFKkiRJkjosGKKSbJ/ktCTnJPlmkucOj2+R5NQk5w5/N1/65kqSJEnSZI3TE3UT8MKq2hV4EPB/ktwLOBr4TFXtDHxmuC9JkiRJt2kLhqiquqSqvjrcvg44B9gWOAA4YVjtBODApWqkJEmSJE2LrnOikqwAfg/4EnCXqroEWtACtlrNc45MckaSMy6//PJ1a60kSZIkTdjYISrJZsDJwPOq6tpxn1dVx1fVblW12/Lly9emjZIkSZI0NcYKUUk2ogWof6iqDw8PX5pk62H51sBlS9NESZIkSZoe48zOF+B9wDlV9Vcjiz4GrBxurwQ+uvjNkyRJkqTpsmyMdR4CPAX4RpKvD4+9DDgGOCnJEcAFwBOXpomSJEmSND0WDFFV9Xkgq1n8qMVtjiRJkiRNt67Z+SRJkiRp1hmiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJEmSJKmDIUqSJEmSOhiiJEmSJKnDgiEqyd8kuSzJ2SOPbZHk1CTnDn83X9pmSpIkSdJ0GKcn6v3APvMeOxr4TFXtDHxmuC9JkiRJt3kLhqiq+hxw1byHDwBOGG6fABy4yO2SJEmSpKm0tudE3aWqLgEY/m61eE2SJEmSpOm15BNLJDkyyRlJzrj88suX+u0kSZIkaUmtbYi6NMnWAMPfy1a3YlUdX1W7VdVuy5cvX8u3kyRJkqTpsLYh6mPAyuH2SuCji9McSZIkSZpu40xx/kHgP4F7JrkoyRHAMcCjk5wLPHq4L0mSJEm3ecsWWqGqDlnNokctclskSZIkaeot+cQSkiRJknRbYoiSJEmSpA6GKEmSJEnqYIiSJEmSpA6GKEmSJEnqYIiSJEmSpA6GKEmSJEnqYIiSJEmSpA6GKEmSJEnqYIiSJEmSpA6GKEmSJEnqYIiSJEmSpA6GKEmSJEnqYIiSJEmSpA6GKEmSJEnqYIiSJEmSpA6GKEmSJEnqYIiSJEmSpA6GKEmSJEnqYIiSJEmSpA6GKEmSJEnqYIiSJEmSpA6GKEmSJEnqYIiSJEmSpA6GKEmSJEnqYIiSJEmSpA6GKEmSJEnqYIiSJEmSpA6GKEmSJEnqYIiSJEmSpA6GKEmSJEnqYIiSJEmSpA6GKEmSJEnqYIiSJEmSpA6GKEmSJEnqsE4hKsk+Sb6T5HtJjl6sRkmSJEnStFrrEJVkQ+AdwGOBewGHJLnXYjVMkiRJkqbRuvRE7Q58r6q+X1U3AicCByxOsyRJkiRpOq1LiNoWuHDk/kXDY5IkSZJ0m5WqWrsnJk8E9q6qZwz3nwLsXlV/Om+9I4Ejh7v3BL6z9s29TdkSuGLSjVhPWKs+1mt81mp81qqP9RqftRqftepjvcZnrX5lx6pavtBKy9bhDS4Cth+5vx1w8fyVqup44Ph1eJ/bpCRnVNVuk27H+sBa9bFe47NW47NWfazX+KzV+KxVH+s1PmvVb12G830F2DnJTkk2Bg4GPrY4zZIkSZKk6bTWPVFVdVOSPwH+H7Ah8DdV9c1Fa5kkSZIkTaF1Gc5HVX0C+MQitWXWOMRxfNaqj/Uan7Uan7XqY73GZ63GZ636WK/xWatOaz2xhCRJkiTNonU5J0qSJEmSZo4hSpIkSZI6GKKWUJJMug3rC2s1PmslTZ7bYR/rNT5rNT5r1cd6LS5D1BJIshFAecLZgpJsBtZqHNZqfEn2
S/L6JMcl2XJum5TWldthH38Px2etxud22Md6LQ0nllhkSR4PPBa4PfCXwCVVddVkWzWdhlodChTwd8DZVXXBZFs1nazV+JI8APgo8GzgQNq2eCJwWlX9eJJtm0ZJlgMbVdXFI4/FH9vf5HbYx9/D8Vmr8bkd9rFeS8cQtYiS3Id23aynAPsAmwHfBf65qn44ybZNmyS7AKcBTwJ2A7ai1euvq+o7k2zbtLFWfZIcAjy6qp4+3D+KVrePAZ8EbjYgNEkOAl4CBPgU8Omq+o9hmUFqhNthH38Px2etxud22Md6LS2H8y2urYDPVdW/V9VLaNfQ2g54QpI7TbZpU2cT4PNV9YWqeitwMnA5cFSS7SbbtKljrfp8EbhrkgcDVNW7ga/SdlDuYDBoktwZeB7wTGB/WpA6IMn/Bod9rILbYR9/D8dnrcbndtjHei0hQ9TiOhPYPskfAlTVvwKfA+4DbDnJhk2hbwN3S/IsgKr6Kq2X4CZgF/AEyBHWqs8VtNC0Z5LfBqiqdwI3Ai+eZMOmzIa0H9ifVdWPgGOBC4DfT/KgibZsOrkd9vH3cHzWanxuh32s1xIyRC2SJBsM51ucADw4ycMBqupjwM9oR3wFJNmwqn4BvALYI8mTAKrqTOAq4LDh/swfCbdWazb35T/6I1BV19GOtt0deHyShw2Lvgz89FZv5JSqqstodToiyTZVdSXwj8Pix02uZdPH7bCPv4fjs1bjcztc2OhvofVaeoaodTT3ga2qW4aH/g24Bthv7gMLnAX8LIn1Bqrq5uHml4HPAPsmec7w2A+BDZJsMpHGTRlrtaCthr/LoP1oAFTV14B3AZsCr0pyIvAi4JRJNHKKnTb8PXgIUlcAbwUekWSmj4DPC+Zuh2MaQoG/h2sw9z0F7jusyfweErfDsWwCv9wOrdcSc2KJtTTMAPa1kS/A0WU7AQ8HngVcBOwO7FtV/3XrtnI6JNmGNsRqo6r6yeiPbJLNgd8D3gT8N/BA4PFVddbEGjxBazqZ31r9uiT7Ai8HzqZ9vo6vqvOTLGOYPCLJ7YEtgPsDX3dGomY4QnnzcHtf2vfVJsDxwL2A5wL7VNX1k2vl5CTZA1hWVV9YxTK3w3mS7Eo7p+6qqvrRvO94fw9HJHkosBNwYlXdOG+ZtRqRZHvgEmCTYd9h9HvL7XCeJHsDLwUOqapL5y2zXkvAELUWktyVdu7APwFPH7pL55b9cic4bV7++wAXjE4fPEuS7AO8ijYu90bgVcP5F/NrtQxYAVw7DDOaOcPO7D1pYeD6ecus1YgkdwdOBQ4HbgEeBjweOKyqzp37sU2y5dC7MtOSPJI2vJGqes/w2Ohn6v7AI2k1/AXw4mHs/MwZdkTeChw6DHuZe3xu1IHb4YihXm8DPgv8Dm3H7PK53pORMOXvYfI4Wg/54bRLLszVxlrNM+w7vJbWg7ID8Oqq+u6w3d0yUiu3Q37te+uHwGur6vS5gxnuPywdQ9RaGBL9B2kb9tm0Hbcbh2UZjoBvV1UXTbKdk5bkEcC7gafTxnY/EfhGVX1g3pHKu1XV9yfY1IlL8kDgdNpRt78E/nFVvQDWqkmbseqYqvrjkSEfLwb+EDi4qn6Q5HeAg4BjaJMnzOSXXZLH0o4+vh/YD3h3VX1wWLbx6NHwYQjfDVX1k0m0ddKS7Ek7OHZYVZ2WZLOquj7J7arqhpFw7nYIJLkn8GHgT4Z6vQV4HcNnKMmyqrpp1n8Ph++ozWg9vX9bVZ9O8lu0A0AbVdU11upXkuxMuyTFkcDXgBcCzwAeNQQpt8MRQ4B6A20EwYOAh1XV/qtYz3otspkcZ7uuqupq2gb+WNoQhuOTPDTJA4cAdQ/glUnuMH9M74zZjXZE5PNVdQZwJfBQaEfc0uwAHJvkjjNeq81oAeBJwCHAyuFoJNCOVForSHLvtBOv7wLcP8mLagC8mTY18FOH1X8CHFdVN8xwgLoD7cT0l1bVm4EPDY/vBjBy8GfPJJtU1RWzGqAG
vwt8AbgyyY607/Z3AX+XZOdhx21HZnw7HPFz4LNDgFpBO1H9L4EvJLnPEAruyYz/Hg5fUdcBlwHXDcPUPknrOfhckntbq19zI21a7v8YDiaeDFwMfCLJ3YftcHvcDud6MR8HPL/aNf7eAtwxydPnrbcd1mvR2RM1piEY3Qn45nBE8pW0o9tvSvIl2vjSA6rqlCR3BDaoqmsm2eZJGWq1jDb7y+2q6gfD43vQNvSDh/ubVtXPkvzW8AMzc4Za3R74Ae0cjCuHOr2BdoT3/cOR8NtX1U9nvFaPBd4InE+7zsU/AMcB76yqtw/r7A0cWFXPmlQ7p8kQot4B/DNtmMdHgC8B2wOXVNUTkmxK68X7m5rRC3sO2+HGtO+sg2nDavenBYIvAo8C9gIOqqrrZnk7hF/W6460UPBh2hTdB9KmyX8z7Yj4c2gH0n7G0NsymdZOVpJ70w76fBv4E+AG4FrgF1X110mePzy+O9Zq7nN1BfB52nfXO4GXAefRpnvfAHj9cMB61rfDnYebl1bVtSPD954NbFNVr5g3lG+m67UUlk26AeuDJPsBr6f1pFyW5BW0nZK9h6Mhy2k7JiuTfLKqrp1caydrpFZX084bO2Zk8S20E2pJ8hTgt5P8OTCrJ6/P1eoq2g7uMcCVVfWlJC8bll0xDLHaI8nhzG6t9qIdtT2sqr6c5BTgOlqv00nD0bjjgK2BXYahMtfPcA/ULlX13WFI1deBpwE7AidVu5gnSb6c5OCqOjHJG2veSe6zYmQ7vAb4Dm3Y4820iYOOH9a5GNiZ1vPCLO+IzKvXV4GjgXNptXnbMEz72CT3AzatNnX+TBo58PPftJ7xdwDvo4WlQwCq6tgk96EdRJvZXuB5n6uv0HrQX0C7ltFdgFfSDmTsM/e97nb4y/2HS5O8tqrOHhZ/HvhUks9X1afmnjPL9VoqDudbQJIH046srayqR9CC1ItpAeFPaUeX/riqfp8WEu46qbZO2rxaPZy2k/vCkVWuBs5L8kTaF+TfV9XNs7ijO69WewE/Bp4/LEtVfRFYSTuy+1LgTVV10yzWanApcNQQoO4KPID2o3oQcBJth+TdtFo9r6qum9VaDT+uX0+b1p2qegvts/R24NMjq55O+85ihgPU6Hb4MFp4OriqjqNdt2fOw4G70XqNZ9Yq6nU72gQcF9J++547rHcocD/aBT1n0siBn2dU1QHAb9G+5/emzRh6QJKtkxxGG8kyk99XsMrP1R2B3atqT+Ao2mQlv6AdCNoiySazPCRtFfsPV9IC59z+w3/Rfh8PTXLniTV0BjicbwHDh3WXqnr/cH85bdjL/mkz7VxfVZ+bZBunxWpq9R7aTsnPkvwv4Hu0APrUqvrmxBo7YWuo1R/RhnncMqzzCeAhs1yr+ZK8nPbd9bokz6TNCHYccCGwWc3wbHzD8L2TaUOsHgxsXFVPHpatpM12dRBtB/dZwB9V1Xcn1NyJW812+D7asL2588WOoA1Ne/Ksb4erq1dVPT7JLrRgfhpwX+BJVfWtSbV10tKmfb/rcL7YXWkTJHyV1suyAS2UXwQ8hHYgdmY/W2vYDp807Dsso50vfAzwuJEel5k05v7Dw2izQD7HHqilY4haQNpF8e4wjDfdkDZc6BTg0VV1xXD+0w01Ms35rFpDrR5TbcrbnYH30noUvj3Jtk7aGLXahnZexgVVdd4k2zrtknwKeHmNTEc9y4bPzrW0Cw2/C/h5VR06LHslsCvtSO/R7owsuB3ejXa+yvGz/p0Fa6zXvlV18XD+z43Aj8vpk39pFQd+7g+8sdp17TavNlnVzBpjO9yBdhmL//T3cKx63bna+dUz/9laap4TtYBqF3abO8cptPG6Vw8B6lBgT+BFtGurzLTV1OqqYaN+Ku2cggNm9aTZUQvU6lBgD+DPZnmM/KqMniQ73H8C7ZzEmZwQYVXqV9eVuT7JUbQZ5k6sNqHLP9AC1LdmdQjfqAW2w8NovZyvnuXzXEetoV4XD9/x
ewAvqqobJtXGaVRVfzFy+z1JPkmbJOF8Wg1n2hjb4QOAV9aMXvx7vjH2Hx6e5LkGqKVniOpQVTfRdkwuSPIG4DHA09zR/U0jtbpwpFaHG6B+0xpq5edqnrkAlWQT2nTKL6ANSfvRRBs2pYajkUcBb0ryXdoP7iMMUL9pDduhAWoVVlOvpxmgft1qDvxsRRvKx+gyrXE7NECtgtvhZDmcr8NwIuNGwDnD30dV1bmTbdV0slbjs1b9kmwEPBo4r6q+M+n2TLu0aZRfShuG/I1Jt2cauR32sV59VnHgZ6aH0q6On6s+1muyDFFrIcnTgK/M8omg47JW47NWWgpJNqfNYPjCYdYmrYHbYR/rNR4P/PTxc9XHek2GIWotzO+e1+pZq/FZKy2VDBe2nnQ71gduh32sl5aCn6s+1msyDFGSJEmS1MGL7UqSJElSB0OUJEmSJHUwREmSJElSB0OUJEmSJHUwREmSJElSB0OUJEmSJHX4H2t0tL4Qpg4WAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 864x360 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
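    "# Show the frequency plot for the \"age\" column inline\n",
    "df.plot.frequency(\"age\")\n",
    "# Same plot, written to an image file at the given path\n",
    "df.plot.frequency(\"age\", output_format=\"image\", output_path=\"images/frequency.png\")"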
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'fare' with function _bucketizer\n",
      "INFO:optimus:bucketizer() executed in 0.29 sec\n",
      "INFO:optimus:Using 'column_exp' to process column 'age' with function _bucketizer\n",
      "INFO:optimus:bucketizer() executed in 0.38 sec\n",
      "INFO:optimus:Using 'column_exp' to process column 'fare' with function _bucketizer\n",
      "INFO:optimus:bucketizer() executed in 0.29 sec\n",
      "INFO:optimus:Using 'column_exp' to process column 'age' with function _bucketizer\n",
      "INFO:optimus:bucketizer() executed in 0.3 sec\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/scatter.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAA18AAAEICAYAAACgUk4PAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAIABJREFUeJzs3XecXVd57//POn16L5qiLqtYtmVLrjK4gwFTQksgJA448eWGhBDu/QWS3F+S+8rN/ZFGSy4QB0MghP4D2wFsbMuW3Itky5Jl9ZE0vZ8pZ86cutf9Y0bGRZam7H3OmZnv+/Xya2bOnL3WM83az1lrPY+x1iIiIiIiIiLe8uU7ABERERERkaVAyZeIiIiIiEgOKPkSERERERHJASVfIiIiIiIiOaDkS0REREREJAeUfImIiIiIiOSAki8REREREZEcUPIlIiIiIiKSA0q+REREREREciCQ7wBmora21q5cuTLfYYiIiIiIiLzOnj17Bq21ded63oJIvlauXMnu3bvzHYaIiIiIiMjrGGNOzeR52nYoIiIiIiKSA0q+REREREREckDJl4iIiIiISA4o+RIREREREckBJV8iIiIiIiI54GnyZYz5Y2PMAWPMi8aY7xljIsaYVcaYp40xR40xPzDGhLyMQUREREREpBB4lnwZY5qBTwLbrLWbAT/wG8DfAl+w1q4DosBtXsWwmE2mshztG8dxbL5DERERERGRGfB622EAKDLGBIBioAe4Hvjx9Oe/BbzH4xgWpUeODvD5B47QNhjLdygiIiIiIjIDnjVZttZ2GWP+AWgHJoH7gT3AiLU2M/20TqD5TNcbY24HbgdYvny5V2EuWBe3VmIMtFQV5zsUERERERGZAS+3HVYB7wZWAU1ACfC2Mzz1jPvmrLV3WGu3WWu31dXVeRXmglVfHuEtmxqJBP35DkVERERERGbAy22HNwInrLUD1to08BPgKqByehsiQAvQ7WEMIiIiIiIiBcHL5KsduMIYU2yMMcANwEvAw8D7p59zK3C3hzGIiIiIiIgUBM+SL2vt00wV1ngO2D891x3AZ4BPG2OOATXAnV7FICIiIiIiUig8K7gBYK39S+AvX/NwG3CZl/OKiIiIiIgUGq9LzYuIiIiIiAhKvkRERERERHJCyZcsSh3DcTqG4/kOQ0RERETkZUq+ZFG649E2vrbrONaesY2ciIiIiEjOeVpwQyRfPnRpK9bCVJcDEREREZH8U/Ili9Kmpop8hyAiIiIi8iradigiIiIiIpIDSr7mqH88wUvdYzpTJCIiIiIiM6Lka45+9kI3X9t1jOGJVL5DERERERGRBUBnvubo5s3L2NJaSXVJKN+hiIiIiIjIAqDka46aKotoqizKdxgiIiIiIrJAaNuhiIiIiIhIDij5EhERERERyQElXyIiIiIiIjmg5GuJG51ME09l8h2GiIiIiMii51nyZYxZb4zZ+4r/xowxnzLGVBtjHjDGHJ1+W+VVDHJ2Q7Ekf3ffIT7/wBFSGSff4YiIiIiILGqeJV/W2sPW2i3W2i3AViAO/BT4LLDDWrsO2DH9seSBMWbqbZ7jEBERERFZCnJVav4G4Li19pQx5t3AtdOPfwvYCXwmR3G4aiSe4t79vVzUWsGmpop8hzNr1SUhPvu2DQR8PkKBwt6BOhhLcmIgxsXLqwj4CztWEREREZEzydVd7G8A35t+v8Fa2wMw/bb+TBcYY243xuw2xuweGBjIUZiz81L3GPcd6OEX+3vzHcqclUWCFIX8+Q7jnB49OsA3Hj9J18hkvkMREREREZkTY631dgJjQkA3cL61ts8YM2KtrXzF56PW2rOe+9q2bZvdvXu3p3HOxehkmocO9nFBcwVrG8ryHc6iFp1I0T4cZ3NzBX6fNkqKiIiISOEwxuyx1m471/Nyse3wbcBz1tq+6Y/7jDHLrLU9xphlQH8OYvBERVGQX7uk
Jd9hLAlVJSGqSkL5DkNEREREZM5yse3wQ/xqyyHAPcCt0+/fCtydgxhERERERETyytPkyxhTDNwE/OQVD38OuMkYc3T6c5/zMgYREREREZFC4Om2Q2ttHKh5zWNDTFU/FBERERERWTJUs1tERF7HcSxZx9uCTCIiIkuNki8REXmdbz5xgr+77xCZrJPvUERERBaNXDVZFhGRBaSyKETWsfiMWjuIiIi4RcnXPAzGktyzt5tLlleyZflZW5WJiCwo79uqNhoiIiJuU/I1D0f7xtl1ZIBoPKXkS0REREREzkrJ1zxc0FLJhy9fzobGMs/nOn3uIuBf+Mf0HMeScSyhwML/WkREREREZkp3v/NQGg7w1vMbWVFT4uk81lq+8OAR/vnhY57Okyvff7ad//Xzl5hMZfMdioiIiIhIzmjla4Foqigi4F8cB9/rysKMJTL4fYvj6xERERERmQklXwuAMYbfvGJFvsNwzU2bGrkp30GIiLzC3o4RHMdyyQqd3xUREe8o+RIRkSXv7r1dJNMOFy+vxKi8voiIeETJlwvaBmLEkhnOb6pwfSvdoZ4xnmwb4tIV1WxuqXB17MUqk3V4sm0Ix4Hta2sWRZESkVxLZx0cawkH/PkOJSc+fs0aso5V4iUiIp5S8jVPwxMpvrrrOMm0w21Xr+Ki1krXxk5lHL75+EkCfsPBnjH+6l3nUxzSj+xcDnSP8aPdHYChrCjAJWoDIDJrdz56gv5Ygj9728Yl8QJGQ3kk3yGIiMgSoDv5eQoFfJSEAkCG0oi7306/z1AWCTA0kaKiKEjAt/hvgNxQEvYT8vuwTFWkFJHZa6kuoiTsx6eVIBEREdcYa22+Yzinbdu22d27d+c7jDcUS2ZIZRyqS0Kujz08keL4QIyVNSXUlYVdH3+x6hiOA9BaXZznSERERERksTPG7LHWbjvX87Qs4ILScAA8youqS0JUl1R7M/gipqRLRERERAqN9rEtUId7x7jr+S76xxP5DkVEZMHrHU3QPTKZ7zBERGSR8zT5MsZUGmN+bIw5ZIw5aIy50hhTbYx5wBhzdPqtqiHM0mQqy52PneCRowP88NmOfIcjIrKgjSfSfPHBI3xpx1GiE6l8hyMiIouY1ytfXwLus9ZuAC4CDgKfBXZYa9cBO6Y/XtAS6SzjiXTO5gv4DZXFIaxF58BewVqL4xT+GUYRKSyhgI/a0hDVxUEiwaVRWl9ERPLDs4Ibxphy4AVgtX3FJMaYw8C11toeY8wyYKe1dv3Zxirkghsj8RRf2nGUiWSGj21fxYZl5TmZdzyRpm8syYqaYoIzLAM9Gk+z60g/29fWUlO6uJK2yVSWr+48zngyze9fu1ZJqYjMyukXbnwu92oUEZGlYaYFN7xc+VoNDADfNMY8b4z5ujGmBGiw1vYATL+tP9PFxpjbjTG7jTG7BwYGPAxzfvrHk0QnUqQylrbBiZzNWxYJsra+dMaJF8DJoQl+sb+H4wMxDyPLj6GJJO3DEwyMJ3VuQ0RmzeczSrxERMRzXq58bQOeArZba582xnwJGAP+0Fpb+YrnRa21Zz33VcgrX+msw917uxiJp3nvJS2elJt3S9axtA/HaakqmlXS9lqJdBbH2oJq+Ow4lkeODhBPZbhhYwPhgLYOiYiIiEhuFEKp+U6g01r79PTHP2bqfFefMWbZK7Yd9nsYg+eCfh/v39qa7zBmxO8zrKotmfc4//poGwPjSf7ilk0E5pHEucnnM1y7/oyLqCIiIiIiBcGzO2drbS/QYYw5fZ7rBuAl4B7g1unHbgXu9ioG8cbGxjIubKnAry06MkeZrMNCaPAuIiIi4iav9439IfAfxpgQ0AZ8lKmE74fGmNuAduADHsewaOztGOG+F3t4/yUtrG0oy1scN25qzNvcS0kyk+W+F3vZ3FTOmvr8/bzd1j4U52u7jrOsMsLHr1kzry2wIiIisrQ8c2KI8USG6zfUY8zCWwjwNPmy1u4FzrT38QYv512s9neO
cKh3nMN943lNviQ3hidSPHyon3TWWVzJ13Cc0ck0yUyWeDJLRbGSLxEREZmZR44MMjSR5Jrz6gj4lXwtadZa9raPkMo6bFtZ7fq2vHde1MT6xnLOb8pNOXvJr2UVRfzJzRsKuojLXGxdUcXoZIqG8ggVxUFXx45OpCiNBLSaJiIiskj93ptWk3acgqk7MFtKvlzUGZ3k20+dImst5UVBNrrc86uyOMRlq6pdHVMKW1NlUb5DcF1RyM87LmxyfdzBWJLP3XuIK1ZXL5giOCIiIjI7br9wm2tKvlxUXhSksjhIKuMsutUKkUJXGg6wpbWS9Q1aGRYREZHC5FmfLzcVcp+v10qks1g79er+QvH4sQEePjzAb12xghU18y9FLyIiIiKylMy0z9fC3CxZwCJBvyeJVzyVIZ7KuD4uwIvdY7QNTHByaMKT8UVERERERNsOXWWt5dmTUVKZLFeuqXWt4Mbz7VG+90wHxsBvXracC1srXRn3tA9ubeXSldVscuGM2vPtUfZ3jnLdhnpaq4tdiG7uDvWM8dSJIa5eU+tadcjRyTQBn6EkvHT+dDJZh+F4ivqySL5DEREREVnQtPLlos7oJN97pp0f7u7kcO+4a+P+fF8PFUUBSsMB7n2x17VxT6sqCXHJ8ioiwfmt2E2msnz36Xb2d43yo90dLkU3N9Za/u2JkxzqGec7T7e7MmYm6/D3vzzE13Ydd2W8heKhQ/38758fpH0onu9QRERERBY0JV8uqigOUl0SoqIoSG2ZewU3GisiDMVSDE8kaawo3NWHUMDHsooistaysjb/Z8dW1ZaQzGRZWevOCpzfZ7hqTS3bVla5Mt5Csba+lEtXVlNdqiIyIiIiIvOhghsuS2UcLJZwwL1zX7FkhkeO9OMzPt60rragt7wl0lmGJ6Z6OLnd52y2UhmHwViS+rLwgu0FISIiIiKFb6YFNwr3Ln6BCgXcv8kvDQd4+wXu90XyQiToL5jeVKGAr2BiERERERHRcoCIiIiIiEgOKPkSERERERHJASVf82St5aXuMfacGiaZyeZkzr6xBMMTqZzMtZAlM1nah+KkMk6+QxERERER0Zmv+drbMcK3njiJtXDV2hp+/dLlns53oGuUbz5+gmDAxx/dcF7BVT/cebifB17q46NXrXStt9ZcWGv5+qMnONo3zoUtFXzs6tWujHn33m7KIgFu2NjgQpQiIiIispRo5WueohMpDFAWCdA7mvB8vv5YklTWMpnKEo0X3urX8YEYA+NJunPwvTiXrmicUMBHR3TSlfEcC8+1R9nXOerKeCIiIiKytGjla562rarmcN84o5Np3nNxs+fzXb6qmtF4mpKwn/NmubLkOJaukUmWVUQ8K73+gW2tXL6qZtaxuc0Yw0e3r+KZk8NctabWlTH9PsOf3LwBv8lvCX0RERERWZg87fNljDkJjANZIGOt3WaMqQZ+AKwETgIftNZGzzbOQurzVcj2dY7w1Z3H+dj2lVy6qibf4YiIiIiILAoz7fOVi22H11lrt7wimM8CO6y164Ad0x8vGAPjSX6+r5tnTgzjOLlpUJ3OOuw+Ocyek8NksnMvHtFaVcyNG+tZWVvqYnQzk8o4jMbTOZ93obDWshAansvCo98rERGRwpGPbYfvBq6dfv9bwE7gM3mIY9Ycx3LHI8cZmUyTzjiEA4aLWqs8n/ehg/38fH8PBnjXZHrOxR6qSkK8b2uru8HNQDrr8OUdR+kdTfDRq1dyflNFzmMoZKPxNF9+6CgNZWF+902r8fnyv63x1NAEvzzQx/suaaamNJzvcGSOnjg2yF17u/jw5SvY0lqZ73BcseNgH0OxFO/f2lIQfysiIiKz4fXKlwXuN8bsMcbcPv1Yg7W2B2D6bf2ZLjTG3G6M2W2M2T0wMOBxmDOTtZaReJqakhDGwOhkJifzDk0kiQR9BAOG6AIsMZ/JWvrHE0xmsgyMJ/MdTsEZS6SJxlO0R+NkC2SVomckwUvdowVZ1EVm
rmtkktHJDD0j7hSdKQSHe8fZ3zVCJkc7D0RERNzk9ZmvJmtttzGmHngA+EPgHmtt5SueE7XWnnX5qJDOfD3VNsTde7torizi1qtWUhYJej5n/3iC7z/dgc8HH7ps+YJciWgbiNE3muCSlVWEA35P5ohOpEhlHRrK515+31rLqaE4jRURIkFv4jzTnCeH4pSE/NTPI3Y3WWsZnUxTWRzKdygyD5OpLCeHJlhdV+LZ312uJTNZMllLSVj1okREpHDM9MyXp8nXqyYy5q+AGPB7wLXW2h5jzDJgp7V2/dmuLaTkSwrXP95/mP7xJH/97s2EAnNb1D3WH+MLDxzm3Vuaecv5jS5HKCIiIiKLUd4LbhhjSowxZaffB94CvAjcA9w6/bRbgbu9ikHOLpHO8vixQXafHCa7CLbwXL++nrdtbiTon/s5kKbKCDdvXsYFLTqXJiIiIiLu8nLfRgPwUzPVEykAfNdae58x5lngh8aY24B24AMexiBn8bN93Tx6dBDsVDXCq9a60w8rXy5eMf/iJ8WhAO+8qMmFaN5YZzTOXXu7eN8lLSyrKPJ0LhE5u/6xBD/c08EtFzSxsrYk3+GIiMgi51nyZa1tAy46w+NDwA1ezZsvE8kMXSOT1JWGqSrx/pxMKuMQS2aoLArOueJXNJ4iEvCTzjqMTnpXBj6VcQj4jCqTTYtOpDneHyM6kVbyJZJnY4k0bf0TDMaSSr5ERMRzOTvzNR+FfuZrPJHmyzuOMhJPE/T7+MT1a2mu9O6meiyR5isPH2MwluTC5ko+csWKOSU2vaMJfryng+KQn/dva6Xcg+Ihjx8b5K7nu6gtC/Pxa9ZQUeRNgZLReJpvPH6CeCrDrVetpKWq2JN53GCtJZbMUBoOML0yLLIgdQzHKYsEFnxhlvFEWn+PIiIyL3k/87WUnBqKE51I01RZRCrj8FL3qKfztQ1MMDCepKmiiH1dcy8H3lgR4Q+uX8fHrl7tSeIFcP+BXqpKQvSOJjjaN+7JHAAv9YxyamiCsckMjx0d9GweNxhjKIsEdaMnC1osmeFLDx7hu8+05zuUedPfo4iI5Ipq9bqgriyMzze1kpS1liaPt5I1lIcJ+H10Tm9zLI0U7o9xQ2M5z54cJhTw0VjhXRn1lqpiIkE/GceyvrHMs3lEZEpJyM8tFzXRUFYY7RFEREQWAm07dEnbQIwD3WMsry7mwpYKz19F7R6ZpG8sweq6Us+28rkhk3U4OTRBeVGQeo9v0kbiKdJZS13ZwuuDJiIiIiIL10y3HRbukskCs7qulNV1pTmbr6myiCYPz5W5JeD3sbY+NytRC/3ciYiIiIgsbjrzJUvKqaEJhmLJfIexYLQPxRlcIN8vay3H+mOeVu6UxSeTdTjcO04inc13KCIisgQo+XLJWCLNsf6Y6zeqjmMZT6QXRRPkfJtIZvjyjqN8bxEUCMiFyVSWf3roKN958lS+Q5mRvrEkX95xhJ/v6853KLKAHOod5/MPHObZk8P5DkVERJYAbTt0QcdwnH/ZdZxU1gHgI5ev4MLWynmPOzqZ5uuPttE7mqCuLMzvvWm1az3ErLUk0g5FIb8r453JycEJekYn2dxcQZlH1RRnozjk572XtFCvM2EzEgn6eO8lzVTnoG+dG2pLQ/zaxS2src/d9l9Z+FbVlvCBra2c31SR71BERGQJ0MqXC+7d3wPAsooiKotD/GhPJ24UMtl1ZICe0QRNlUUMxlI8eLBv3mOe9tChfv7sp/t55sSQa2O+0sB4kq/sPMb3n+3gu08XxkqTMYbta2tZ13DuM2j37O3izsfalvSKozGGK9fUsr6xPN+hzEjA7+O6DfW0VhdujzcpPCXhADdualgwLzKIiMjCpuTLBcmMQ8A/9a0M+Azp6RWweY+bzuKfbp4c8BkSaXfGhalVtVTGYSyRcW3MV0plHbKOJeT3Me7RHF4aiCXpGUngLIBqoCIiIiKyMGjboQuu31DPNx4/wXgijWMtb79gmSul
5q9eV8u+zlF6RicJ+n1cu77OhWin3HJhE9tWVNNc5U3FxKaKCB/Y1srJwQmuXV/vyRxe+p2rVuFYS9Cv1ydkYeqMxoGpHngiIiJSGGbV58sYU2KtnfAwnjNaCH2+2ofidEbjVJeGWN9Q5lqfr/FEmv7xJLWl4YLu5yUiheXPf7ofn4G/fs8F+Q5FRERk0XO1z5cx5irg60ApsNwYcxHwX6y1vz+/MBeP5TXFLK9x9xXmrGPZcbCfFzpGOL+5nPdsaX55e+N8WWuJJTOUhgNzShTTWYfukUkaKyKEA94V7RCRufmNS1sxeNvsXURERGZnpnfyXwDeCgwBWGtfAN7sVVAy5WDPGLuO9BMO+nj06CAvdo+5NvYDL/XxF3cf4Km2uRXc+OlzXXz+/iMFU0xDRF7tgpZKNrd4V8EvlXH454eO8vPpgkMiIiJybjM+82Wt7XjNCok6Uk5LpLP89LkuDvWO0VRVxAe3trpSEj7rWKyFkN+HMVPNQN0ST2XIOA7x1Nx+jJPpLBnHMqnGpCJLkmMtwxMpakrU1FpERGSmZpp8dUxvPbTGmBDwSeCgd2EtLPfs7ebZU8M0lEVoG5jgm0+c4I9vPG/e5742LivngpYKXuoeZ2NjOZub3XsV+x0XNrFtZTXLKuZWcOMD21q4eHklq2pLXItJCs+x/nFKw0EaKyL5DkUKTCTo58/evvHliqwiIiJybjNNvj4OfAloBjqB+4FPzORCY4wf2A10WWtvMcasAr4PVAPPAb9lrU3NNvBCcqR/nLrSMKGAj8byCJ3ROMmMQyQ4v7NQoYCP265eTTrrEPAZ14p4AAT9vnlVQSsOBbiwZf6NpKVwTaayfHXncVqrivnUTeflOxwpQG6dQRUREVkqZvQvp7V20Fr7m9baBmttvbX2I9bamR4W+iNevUr2t8AXrLXrgChw2+xCLjwrqouJTqSw1jIUm6pMGA64d1MS9PtcTbxeK+tYRuPaOiSvFgn6+MgVK3j3xc35DkVERERkUZhptcMvn+HhUWC3tfbus1zXArwD+Bvg02Yqg7ge+PD0U74F/BXw1VnEnHfprMOJwQlSGYemyiLec3EzE6kMx/pi1JdH+O0rV3qaLLntp8938tjRQX7nqpVsWV6V73CkQGQcy6NHB2muKtL2UhEREREXzHTbYQTYAPxo+uP3AQeA24wx11lrP/UG130R+BOgbPrjGmDEWpuZ/riTqa2Mr2OMuR24HWD58uUzDNN7iXSWrz/axonBCXzG4PcZPrZ9JR+/Zi1Zx7p6/iHrWHYd6edIX4w1dSVct77ek20+qYyDYyHtzLznWyHKZKe+jpCLq45LmbUwkcyQVFEVEREREVfMNPlaC1x/OmkyxnyVqXNfNwH7z3SBMeYWoN9au8cYc+3ph8/w1DPe8Vtr7wDugKkmyzOM03OPHBngxOAEzZVFGGOYSGb496fa+ct3bnI9Mdp1pJ979nZTWRziUM8YjgNv3dzo6hwAH9jWyg0bG6gvC897rLFEmuP9MTY3VxDM4XmQ/vEEX915nEQ6y23bV7G2oezcFxWYY33j7Osa5YaNDQXRUDsU8PGZmzeQi0Xc59ujdEYnuWlTw7zPSsqUE4MTAFq1FBERKSAzvTtuBl75L3gJ0GStzQLJN7hmO/AuY8xJpgpsXM/USlilMeZ00tcCdM826HzqGpmk5BWNiUvCASbTWSaS7q8OHOmLUVkcoqIoSHVJiMN9467PAVNnyhrKI65slXzq+BBf3XmctoEJFyKbuWN9MUYm0mDh2ZPDOZ3bLXft7eau57t4sXM036G8zOdyoZczsdbyg2c7+M8Xujk+EPN0rqXk64+2cedjbfkOQ0RERF5hpitffwfsNcbsZGr16s3A/zbGlAAPnukCa+2fAn8KML3y9d+ttb9pjPkR8H6mErJbgTc8M1aIGssjHOga43ShwHgqQzjgozjs/qv1a+tLOdQzhs9ANJ5i64rCP4912apqqkpCOX+1fU19KeVF
ARKZLJcsgO/Tmbz9gkZaqoo4v7k836HklDGG917STPtwnNW1pfkOZ9H4natW5jsEEREReQ1j7cx29BljmoDfAg4xtfLVaa19ZIbXXstU8nWLMWY1vyo1/zzwEWvtG62eAVPbDnfv3j2jOL0WT2X4l11tdETj+DD4fFM3OZua3OvBdVom67DjYD+H+8ZZU1fCTZsadZ7pLJKZLI4DRaH5J8Lfe7qdvvEEf3DdWpXTFjmDZCbLlx88ytqGUn7t4pZ8hyMiIpJXxpg91tpt53reTKsd/i5TJeNbgL3AFcCTTG0lPCdr7U5g5/T7bcBlM7muEBWHAvz+dWs41h8jlXForiyivtybBrQBv4+3bm705JzXYhQOuLf66FhLdoEXIBHxmmMtjpPvKERERBaOGa18GWP2A5cCT1lrtxhjNgD/01r7614HCIW18nVa1pm6OddK1OJlrV1QLQNEck1/IyIiIlNcXfkCEtbahDEGY0zYWnvIGLN+njEuSNZadh0Z4MGDfUymsmxaVs77trZQWRxydZ7JVJaXekZxLKxvLKM8kv/qd0uNbipFzk5/IyIiIrMz0+Sr0xhTCdwFPGCMibLAqhS65fn2Ee56vovGighVxSGO9MX41hMn+eQN61y7EUmks3xl5zE6o5MYA1XFIT55wzrXyo+PJdIYoMzFhC6RzvLo0QEiQT9Xrq7J+TmprGN54vggiXSWq9fWzfvc10QyQyZrqShefEnveCLNi12jtFYX03K6cozIEqYVPBGRhaNrZJJ0xmFFTfGC/H/3jJIva+2vTb/7V8aYh4EK4D7Poipgjx8fpKo49PL5oobyMB3RSfrHkzS4dPbrWH+M7pFJlldP3Rh3DMc50DXKVWtr5z32eCLN3993GGPgT27eQGl4pvn32e042Mf9L/VhLRQH/WxdWe3KuDP1QkeUH+3uBCCRdnjnRU1zHiuVcfj8A0eYTGX59FvOo7Z0/v3PCsmPdneytyNKWSTI/3vLJvXVkiWtfSjOV3Ye4yOXr2Bzi/uFk0RExD394wm+/OBRMo7lv167hrX1C69K8qzvvK21u7wIZKHwG3PGrtBu5t2vT+Kta+MbY/D7DAbwuRi0zxhOHx/Mx6sQPp/v5e/bfBfdjGHqe2Smvq43UrIVAAAgAElEQVTFxu8zgHH15y+yUBkDkaAfv16DEBEpeD4zdX92+l5tIZpxqfl8KqSCGy90RPnm4ydpKI8Q9PvoG0uworaET1y7xrWkI5nJ8rVdxzk1FMcAtaVh/uD6ta5tE4ynMsBU5Ua3JDNZnm4bJhz0cemKanw5/oNwHMvuU1ES6SyXr66ed+XDRDpLxrGurQwWkngqw8GecVqqilxbrZXCdHqL6ZbWKldaMIiIiORb/1iCVNYpuKMTMy24oeRrlqy1PNk2xP0H+ogl05RFAiTSU7WWL1lexTsvanJlG1cineVI3ziOhXX1pZQswiRARLz15PFB7nzsBH94/Touaq3MdzgiIiKLlpIvj1k7tdLy3adOUV8eIeAzdI8muGxVFR+6bEW+wxMRIZ7KcKQvxsZlZa72wRMREZFXm2nypSZVc2SMYc/JKBXFISJBPwG/j5aqIp47NUImq66jIpJ/xaEAW1orlXiJiIgUCO1lmwdjwHnFyqFXBSf2d47ws309GGN410XL2NSkilwiIiIiIguNVr7m4ep1tcSSGcYm08RTGTpHJnnTulpXq6/0jyX49pOnyDiWdMbh3544yWAs6dr483Wge5QD3aP5DsMVOw/3872n25lMZfMdypKRdSxPHBukZ3Qy36EsSqOTaR45MvBykR0RERHJLyVf83B+UwUfvWol5UVBgn4f77igkbdfsMzVOYYmUgCUhgOURgJYIDr9WCH47tPtfP+Z9nyHMW+OY/nF/h52HR2gIxrPdzhLRv94gu8+087Ow/35DmVReql7lH9/8iRH+2L5DkVERETQtsN5u6ClkgtaKnEcS9vgBD/b14NjLRuXlbOuvpTAPJtO
NZZHCPgNQ7EkFgj5fdSXFU558I9fsyav848n0sRT2XmXTPf5DL91xQp6xxKsrClxKTo5l8byCJ+4bi3LKgrnd3oxuXh5FX90Y5DzGsryHYqIiIigaoeuSGcdvvd0Oy90jhDw+/AByazD2vpSPrZ91ZxKzzuOZSCWJOtYEuksDx3qx2cMN21qoLW6sPoa5IvjWP6/ew8Sjaf5oxvW6fsiIiIiInkx02qHWvlywZPHh3i+fYTW6qKXi21YazneH+P+l/p410VNsxovk3X4j6dPsa9rDAOsqi3hY9tXFXSTVGstndFJQgFfzhr3GgNlkSATqSzhgHbQioiIiEhh0x2rC3Yd6ae2LPSqKofGGBrKIzx5fIhUZnal559rj7K3Y4TmighNFRHaBmKen4npH0vwuXsP8s3HT8ypVP4jRwf44oNH+PtfHualHBXgMMbw+9eu4S9u2UR9jhI+Lz1/Ksq/Pnq8YAqqWGt5um2II71j+Q5lRkbjab7x2AmeODaY71BEREREzsiz5MsYEzHGPGOMecEYc8AY8z+nH19ljHnaGHPUGPMDY0zIqxhywVrLSDxN0Rm2Fgb9PjJZh2RmdtXzBmNJQn4/xhiMMZSEAvSPe3tDfnwgRvtwnH0dI4xMpmd9/YGuMUrCAXwGjvbn7nB/wO+b07bOQvTw4X4eOzrE4Z7xfIcCQCLt8KM9nfxif2++Q5mRtsEYjx8b5MGDffkORUREROSMvNx2mASut9bGjDFB4DFjzL3Ap4EvWGu/b4z5GnAb8FUP4/CUMYamyiLGkxnKI8FXfS6RzlIc8lMcmt23eXl1MamsQybrYIxhLJlhda23RSA2N1dw9dpa6krD1JTMPh++5rw6vvXkScIBP1tXVLkfYI4NT6RIZRwac1gI4gPbWrmgpYItyytzNufZFIX8/OH1a2f9+ztbifTUttH59sfbuKycD1++nBU1OvsnIiIihSknBTeMMcXAY8B/BX4ONFprM8aYK4G/sta+9WzXF3rBjefbo3z7iVM0VUZerm7oOFNnoN59cRPXrq+f1XjWWh482McDL/XhWLhqTQ3v3tLsav8wLyTSWXzGEFrg56+SmSx/8/ODJNJZ/ttb1ufsDNtSFJ1I8bf3HeLqdbXccuHszkaKiIiIFIqCKLhhjPEDe4C1wP8BjgMj1trTHT87geY3uPZ24HaA5cuXexnmvG1prWTwwiQ/29fD+GQaC5RGAly/vp43raub9XjGGG7a1Mh16+uxTG1fXAgWy/Y/vzFUFgUZM4ZIYHF8TYUqFPDRUl1MXWk436GIiIiIeM7T5MtamwW2GGMqgZ8CG8/0tDe49g7gDpha+fIsSBdYO1VuHjv1xdjTjzkWx1r8zG3F6pU9wk6vUM53a5acW8Dv449uPI+sYxf8Kl6hKwkH+IPr1uY7DBEREZGcyEmpeWvtiDFmJ3AFUGmMCUyvfrUA3bmIwUs7j/Rz/4E+WqqLCPimzmY5juWp44MEfYZfu6RlzmNnsg73vtjL49MV3N68ro63bm4s+C2IC53fZ/Q9FhERERFXeVntsG56xQtjTBFwI3AQeBh4//TTbgXu9iqGXEhmphogN1ZECPh+9e30+QxNlcU8cXyIscTsqweetuNgPzsO9lFTGqKmJMQDB3tdKTufSGc51h+bWrHz0HgizS/29/B8e9TTeV6rbyxB72gip3OKLCSpjMOeU8O0D8XzHUpBSGVmX5lWRERktrzcU7UMeNgYsw94FnjAWvsz4DPAp40xx4Aa4E4PY/DcYGyqKl74DGeD/D4DBnpG5p4E7GmPUl8+ldgF/D5qSsI83z4yn5AB+NkL3fzj/Yc97x/2yxd7+eWLvfzHU6foH8tNMjSWSPOFB47whQePMBqfe+Irspg91TbENx87yVd2HmMypaTj/zx8jM/ff4RcFKESEZGly7Nth9bafcDFZ3i8DbjMq3lzJTt9nivgMzjn+Ld6PtvXwgEfE8kMTBezyLh0Dqm2LExxyE+tx4UO
KoqD+HwQCfkJ56ggR8jvo6okhLU6syXyRiqLgoSDPiqLgwT82mK7qrbE850AIiIiOSk1P1+FVGp+IpnhgYN9PH18iHTWsqKmmK6ROMWhAGVn6PM1kcrwF7ecP+ckYH/nCN984iTFQT92eszffdMqNi6rmPfXks46nldSdBxL2+AEVcVBanJY0S47nRHr3JY7so7loUN9rKgu5rzG8nyHIy4ZjCUpDQcWVKXSFzpGGJlM8eZ1dSpAJCIiBaMgSs0vNqmMwx2PtNE1Eqe+LELAZxiIJRmMpTAkaaospqJoKgGLJTNE42k+dFnrnBKvqaQlRijg57btK9nTPoIxcOXqGtbWl7ny9eSihL3PZ1hbX+r5PK+lpMtd44k09+7vZXNzuZKvRcTrlW8v7DjUR99ogitW15xxu7eIiEghU/I1Cy91j9IZjdNSVfzyY1XFIQxT5eUrioJ0Txd5qCsN89GrmrmwtXLW81hr+eHuDp45MQwG1jeUcdvVq15Ver5Q9Y4mODEYY3NzxetWAnMpkZ46w7KQXtEvZJXFIf74pvOoKD73z7RndJKsY1/1dyLilo9etYrkG5yzFRERKXRKvmbhcF/sjP/gn066PnvzBtKOxVpLRVFwzltixiYz7D45THNVEQY42h+jZzRBa3Xh38z+66Nt9IxMcs36On790vw1x/7yjqNkHMufvm2Dtia5ZCa/f7Fkhi89eJSsY/nM2zYsyJUVKWxVJaF8hyAiIjJnSr5moSjkJ3uGM3KOBZ+Zas5bHJ7/6lQoMFXZMJlxCExvn1soKzhNlRGGJ1LUl0XyGseGxjIcq6bUuRby+2goj5DMZCkOLYzfWREREZFcUfI1Cxe1VLDrcD9Zx77qTFH/eIKtK6pcq6xXFPLz4cuX8/1nOsg4Du/Z0kRdmfsrCNZaHj7cT2VRkEtWVLsy5q1XrmR0Mk11nl+dfteW5nmPMTqZ5qFD/WxfW5P3ZHKhCAV8fOrGdVg7dd5PRERERH5FydcsLK8u5oaN9Tz4Uj+hgI+g30c8laGuLMwNGxrY3znK4b4xHAfW1JVwfnPFnFesLmypZHNTBRbvikeks5b7D/RRXx52LfkK+H05rWoIU0nkZDpLJOB39Ya/YzjOAwd6aSwLK/maBWMMWnAUEREReT2Vmp8lay0nh+I8dyrKRCrDhsYy6krDfOfpdkbjaUIBH8ZMFXwoDvn56PZVrK7LfbW/meofTxD2+2dUSKEQZR3LD55t57lTIyyrjHD7m1e7VujDcSwd0ThNlUU5qQwpIueWSGfJOJbSsLuvHXYMx7EWltcU/tlaEREpPCo17xFjDKtqS1hVWwLAWCLNP/zyMADNVUWvem4skeGOR9r49FvOK9iVk0KNa6a6RybZcypKc2URHdE4L3aNcuWaWlfG9vkMK2pKXBlLRNxx52Mn6BmZ5C/eOff+ia81lkjzTw8dxVr407dvzPu2aRERWbz0cv487TkZJZ7MUlX8+n+sSyMBrIUnjg3OetyhWJLvPHWKbz5+gu6RSTdCfUPjiTTHB2KkMo6n8/SOJvjF/m6eOxXFrRXX0nCAgN/H0EQKa8lreXsR8d75TeVsWV75cjEiN0QCfpori2isiKhQjIiIeEorX/P0zMlhKs+yZa+mNMTTJ6K8e0vzrCrvfeuJk/SOJQj4fJwaauPP37HRk61vk6ksX3zwKNGJFOc3l3Pb1atdn+P0PF/ZeYzJVJZ01mIMXLy8at7jVpWEuG37Sp49GWVVXQnnN3nXANhay45D/ZSEAly5psazeU5LpLP8Yn8PF7ZUuNZYW2Shu3Z9vetjhgI+PnnDOkAVUkVExFtKvuYpkcqetahGwGdIZbJYy4yLEFhr6R1LUFcWxm8MXSOTJNJZT5KvsUSakXiKyuIgx/snXB//tPFkmngqQ1NFEd2jCQbGk66NfV5jOec1epd0nZZ1LA8d7KOqJJST5GsknuaRIwNYUPIl4jElXSIikgtKvuapuaqIU0NxqgNnPiMQ
S2ZoKI/MqgqfMYZr19ex42A/AJcsr3L9cPlp9WVhrttQz77OUT5wUZMncwDUloS5sKWSfZ2jlEUCbFle6dlcXgn4ffz3t27wrPrkazVWRPjM2zaccUtrLmWyDiOTac8aJg9PpCgNB1w7vyMiIiJSqFTtcI6stVgLR/rHuWNXGy1VRa975dRaS0d0kg9d1splq2a3UmKt5dRQnKy1rKwpydkNv5ccxzIcn7rRXihNowXuP9DLL/b38N/esp7WancrwQ2MJ/ncvQe5ak0t79va4urYIiIiIrmiaoce6RtL8MiRAfacipLOOjRXFlFfHqZ9eIKG8qKXk4pUxqF/PMGGxjK2tM7+bJMxhpW1i6vSns9nPFk9cRxL/3iSqpIg4YCSOretrith64qqs55tnKuySICtK6rY0KhtlSIiIrL4KfmahbaBGP/ySBsAtSUh/D7DWCJDdCJFTWmIeCrDcDyFAYJ+HzdubOD6jfXaTuWxe17oZteRfpZXl/DJG9YtilXCQrK2vsyzM2eRoJ8PX77Ck7FFRERECo1nyZcxphX4NtAIOMAd1tovGWOqgR8AK4GTwAettVGv4nBLOuvw7SdPURLyv6qceUVRkLJIgI7hOL+zfSX1ZREsUFsacmUVJpHOMjSRoqYkpK16b+DEYIyg30dXdJJkJktxSK8piIiIiEjh8XJJJgP8N2vtRuAK4BPGmE3AZ4Ed1tp1wI7pjwve4d5xYonMGftI+YyhvCjI48eGaKosormyyJXEazyR5osPHuGLDxzhH+8/zGg8Pavrn2+PsuNgH8lMdt6xzIa1lgPdo3RG4zmZ7/1bW9m4rJwPXdaqxEtE5qRtIMax/vF8hyEiIoucZ8mXtbbHWvvc9PvjwEGgGXg38K3pp30LeI9XMbipbyxx1lLx5ZGg68nGkb4Y/eNJmiqLGIqlONg7NuNrB2NJ/v2pU/zkuS5e7Bp1Na5zaRuc4F92HecrO4+Tdbwv6NJaXcxHt69i68pqz+cSkcVndDLNVx6e+n/WYMy9NhgiIiKvlZNlAmPMSuBi4GmgwVrbA1MJmjHmjB0zjTG3A7cDLF++PBdhnlUk4Mc5S2XIdCaL3xgyWYeAS/24KoqCWAvRiRQWO6uCB6XhAMsqIkQnUtSXRVyJZ6ZqS8OsqCmhqaIIHb8SkUJXHPJzXmMpmaz1rK2HiIgI5KDUvDGmFNgF/I219ifGmBFrbeUrPh+11p61HGAhlJofGE/yt/ceYlllBN+rlsAsXdFJXugcobY0xOq6Mq5dX8e16+vn3RTZWsvuk8O82D3GxmXlXL6qelaNQDNZh6y1qgA4R0OxJD/b18ONmxporizKdzgiIiIiUqBmWmre0zJ8xpgg8P8D/2Gt/cn0w33GmGXTn18G9HsZg1vqysJcuqqarmgc5xVb6dqH4jx3aoSg38cFLVWURQLcu7+XnzzXNe85jTFcuqqGj25fxRWra2aVeMFUU2AlXnPXP55kz6lhOodzc3ZNRERERBY3z5IvM5Up3AkctNZ+/hWfuge4dfr9W4G7vYrBbe+9pJmr1tbSM5qgMxqnfWiCFzpHKS8KcMXqGoqCfsIBP81VRTx7YpiB8bOfHegfS/CNx07w4ME+ct3sejSeZvfJYSaSGdfHTqSz/PJAL8+eGMrZ15XKOPz0+U6+89QpxhOzK0zyRjY0lvE/btnEpTk6S3akd4z/fKGbkXgqJ/OJzNfAeJKYB/8PySXHsXRG42SyTr5DERGRJcDLze3bgd8C9htj9k4/9mfA54AfGmNuA9qBD3gYg6uCfh/v39rKDRsaOD4wVQxjMp3lvIayV61K+YzBmKkiHXVlb9xU+L4DvRzoHuXFrlE2N1XQWJG7s1n/ua+bBw/28f6tLdxyYZOrY+/rHOWevd2Egz5W15VS40Fj5dc62j/OrsMDOBaaKiJcv7Fh3mMaY3J2Xs5xLHc+fpKJZIZkJsv7t7bmZF6RuZpIZvj7Xx5iTV0p/+WaNfkOZ84O9o7x
lYeP8ZuXr+CqtbX5DkdERBY5z5Iva+1jwBvtk7vBq3lzoaokxLaSakbjaR4+1I/l9V+ohXP25VpTW8rejhFqy8JUFM28mIYbNjeVc2p4gvPqS10fe1lFhIriIHWlYUojuTm8Xlc2NVcq49BStfDOZ/l8hvOXlbOva8SzhsYibioK+rnmvLoFfx6yubKIa9bXs8aD/xeKiIi8lucFN9xQCAU33sg3HmvjcF+MxvJfrZCMxFMEAz4+e/OGs1Y+tNYyPJGiJBxYdA2U01kHvzH4cljuMJbMkM1aKmZRFbKQWGtJZy2hgKdHMUVERETEZQVRcGMpeN/WVurLwnRG49P/TeIzho9uX3XOkvPGGGpKw4su8YKpLZq5TLxgqrz+Qk28YOr3QYmXiIiIyOKlhibzVFEU5FM3nsex/hgD4wkqikKsayhdlAnVYmWtpTM6SV3Z4kyERURERKQw6GV2F/h9hvKiACXhAKWRAGGtXiwo3aMJ/uH+w/xif0++QxERERGRRUwrXy7Y1zHCt586BUxVrbt5cyNvOb9xzuOlMg6d0TgN5RFKwt78iA72jHK4N8aFLRWsrsvdQXPHsRjDrHuWeam2NMSNGxvY3FSe71BEREREZBHTEs08WWv58XOdVBUHaa4sYlllhAde6mV0cu69pn60u4N/2nGUf374GFnH/YIo3SOT3PnoSZ5sG+KOR9py1lfq1NAEf37Xfu7aO/MG1Jmsw78/eZJdR7zrxR0O+HnnRU2symESKuK1u57v4u5Z/K2JiIiI95R8uSCRdggHps4K+Y0BDOl5NOzsG0/i8xmGYykyjvuNP+OpDBZLbUmIdNYhnsq6PseZDE+kGJ1M0zk8OeNr0lnLwZ4x2gYmPIxMZPE51DPGoZ6xfIchIiIir6Bth/NkjOGyVVU8fmyIiqIgsWSGlTXFVBWH5jzmhy9bzuPHB7mgqfzlpM5NK2tKuHRlNXs7RnjzeXUsy1Fz54taKvnjG8+jYRbzFYX8/Pk7NqkK4Dn0jyXw+0xOGlrLwvCpm87LdwgiIiLyGurz5YJ01mHXkQGO9cVorIhw06YGz85qibxW1rH8j7v2Ux4J8qdv35jvcERERESWnJn2+VKG4IKg38eNGxu4cWNDvkORJcjvM9xyYRMRrQ6KiIiIFDTdrYksAtvX1rJ1ZXW+wxBZcPrHEvzTjqOcHNS5UhER8Z6SLxERWbLGEmlODE4wFEvmOxQREVkCtO3QJQe6RvnFiz1curKaa9fXz3u84YkUdz52gmQmy8e2r6KpssiFKM+sZ3SSQ73jXLqymlKdVRORJWRtfRl//Z7NFIfcL24kIiLyWlr5csk9L3QzGk/zny90k0jPv3T7i12jdEcnGZtM8+TxIRcifGM/ea6Lf3v8BM+c8HYeEZFCVBIOFFTjdxERWbyUfLnk4uWVxFNZzm+qIOxC4YMVNcWEgz4cB85r8Lb57/a1tVy2qoaNy8o9nUdEREREZCnzrNS8MeYbwC1Av7V28/Rj1cAPgJXASeCD1troucYq9FLzANZa4qksRUE/Pp87r6COxtNkHEe9m+SsrLV856lTlIQDvPeSlnyHIyIiIrLkzLTUvJcrX/8G3Pyaxz4L7LDWrgN2TH+8KBhjKAkHXEu8ACqKg5QXBekbS5DKOK6NK4uLtdAxPEnP6GS+QxERERGRs/CsuoK19hFjzMrXPPxu4Nrp978F7AQ+41UMC10yk+VrO4/TEY2zrKKIT1y3lkjQnUPhB7vHePz4IDdubGBlbYkrY0p++HyG/+fm9ejEioiIiEhhy/WZrwZrbQ/A9Nv5lwVcxHpHE3RGJ2muLKZndJLOaNy1sX/6fBd7O0b4+b4e18aU/An6fQT8OsIpIiIiUsgK9m7NGHO7MWa3MWb3wMBAvsM5q47hOId7x3Ecd8/P1ZSGKY0E6IrGKQ4FqCuLuDb29nU1VJeEuGK1GvOKiIiIiOSCZwU3AKa3Hf7sFQU3DgPXWmt7jDHLgJ3W2vXnGqeQ
C250j0zyxQePkMo4/PqlrVy5ptbV8UfjaTqicZori6gqCbk6toiIiIiIzF8hFNw4k3uAW6ffvxW4O8fzuy6VcchkLdnpaoduqygOsrm5QomXiIiIiMgC51nBDWPM95gqrlFrjOkE/hL4HPBDY8xtQDvwAa/mz5UVNcX89pUrGE9muGxVYW7hs9by7MlhJlMOb1pX62pFRhERERERmRkvqx1+6A0+dYNXc+aDMYYty6s8n+f09lBjZp84xZIZvvdMBxnHYU19CS1VxW6HJyIiIiIi5+BZ8iXQP5YAA/XzLJTxYuco33u2HZ8xfPiy5WxsKp/V9aXhAG85v4F4MktDuXtFO0REREREZOYKttrhQtc1Msk/3H+Yz99/hL6xxJzHSWcd/uOZU5SEAoQDPr7z9KlZV1U0xvC2zct439YWgipHLiIiIiKSF1r58kgm65B1LNZAZh4l6K2FrAN+n8EHZJMW7+pTioiIiIiIV5R8eWRFTQmfuG4tPmNoriya8zihgI/3XtLMT57rBOCD21rxq2CG5MjRvnFKwgGa5vE7LCIiIiJTlHx5aHVdqSvjXLG6hi2tlQBEgn5XxhQ5l8lUlq/tOk5LVRF/fNM52/GJiIiIyDko+VoglHRJrkWCPn77yhVUFKnHnIiIiIgblHyJyBkZY7io1fs2CiIiIiJLhZIvD43G02Cgoig462v7xhLc92Iv6azDTZsaWFFT4kGEIiIiIiKSK6o77pH9nSP8r1+8xN/8/CUO9YzN6tpEeuqszaGeMU4OTvC1XceJTqRcictxLA8e7GPHwb6XGzcvBWOJNHtORUllnHyHIiIiIiJLlJIvj+w+FSXs9xHw+XiuPTqra6PxFBPJDPXlEWpKw2Qdy0As6UpcgxNJ7tnbzd17uxl2KaFbCJ5pG+Zru47TNhjLdygiIiIiskRp26FHLllexUs9Yxjg4uWVs7q2qjhEccjPwHgSv8/g9xnqSsOuxFVbEuaWC5dhzNQ8S8Wlq6qpLA6yutadCpQiIiIiIrNlFsLWs23bttndu3fnO4xZi06kMAYq55Dk9I4muPfFnukzX42sqtWZLxERERGRQmSM2WOt3Xau52nly0NVJXNfWWqsiPDR7atcjObMBmNJgj4fFcWzLwoiIiIiIiIzp+RrCXuqbYgf7+nE7zPc/ubVrHGpKbSIiIiIiLyeCm54IJ11GIwl51VZL5nJsr9zlL0dI8RTGRej+5U9p4YpCfmx1vJi16gnc4iIiIiIyBStfLlsNJ7mq7uOMxRLUl0S4uPXrJn19sN01uHrj57geH8MDDSWR/jEdWspCbv747p8VQ0/eLadgM/HRS2zKwoi4pb+8QTlkSCRoD/foYiIiIh4Ki/JlzHmZuBLgB/4urX2c/mIwwtPnRhiMJakubKI7pFJHjs2yDsvaprVGKeGJjgxMPF/27v3UL/rOo7jz9c5Z1fnZVexzTnN84cLzESWpoQukZniiiyUIi1BokKDLlj/SIVJQRmRXaQkK83EskYIOrxgRJmaltoU56Wcky3ZvKSiTd/98ftu/dim7qzfvt/jOc8H/Ph9P5/fl533xmvn+3t/ryyaPYMkPL75BR7c8BxHLp490FqPWjKH0QV7MzwcZg24sZN2xUMbnuN7tz7M6IJZfPKEQ7suR5IkaY9q/bTDJMPApcDJwFLgzCRL265jTxkK2x5eXAXDu/EvnISi7y6UBRlQfdvbd+YUGy91ZngoDAWmjngGtCRJmvi6+Na9DFhbVY8AJLkaWAn8vYNaBu6Yt87j/vXP8sTmF3jL7BkcNzp/zH/GQXNmsvSAfbY9J2zx3JkcdsA+gy9W6tgh82dx4WlvY6anHEqSpEmgi+ZrIfB433gd8M7tV0pyLnAuwOLFi9upbABmTRvhvOWjvPifV5gxZZihobEfsxoZHuKsdy3h0aee59WCJfNmMm3EL6eamPaZ7mMOJEnS5NBF87WzbmSHJz1X1WXAZdB7yPKeLmqQhobyf98cY2R4
iNH99x5QRZIkSZK61sWFFuuAA/vGi4D1HdQhSZIkSa3povm6AxhNcnCSqcAZwKoO6pAkSZKk1rR+2o6lGdYAAAX2SURBVGFVbUnyaeAGereav7yq7m+7DkmSJElqUyf3GK+q64Hru/jZkiRJktQFH64jSZIkSS2w+ZIkSZKkFqRq/N/FPcm/gH+09OPmAU+19LM0cZkjDYI50iCYIw2COdKgTNQsHVRV899opTdF89WmJHdW1VFd16E3N3OkQTBHGgRzpEEwRxqUyZ4lTzuUJEmSpBbYfEmSJElSC2y+dnRZ1wVoQjBHGgRzpEEwRxoEc6RBmdRZ8povSZIkSWqBR74kSZIkqQU2X5IkSZLUApuvRpIVSR5MsjbJBV3Xo/EryeVJNia5r29uTpLVSR5q3mc380nynSZXf0tyZHeVazxJcmCSW5KsSXJ/kvObebOkXZZkepI/J/lrk6MvN/MHJ7m9ydEvk0xt5qc147XN50u6rF/jS5LhJHcn+V0zNkcasySPJbk3yT1J7mzm3LY1bL7o/bIBLgVOBpYCZyZZ2m1VGsd+AqzYbu4C4KaqGgVuasbQy9Ro8zoX+H5LNWr82wJ8tqoOA44GPtX83jFLGouXgOVV9XbgCGBFkqOBrwOXNDnaDJzTrH8OsLmqDgUuadaTtjofWNM3NkfaXSdU1RF9z/Ny29aw+epZBqytqkeq6mXgamBlxzVpnKqq24BN202vBK5olq8A3tc3/9Pq+ROwX5ID2qlU41lVPVlVf2mWn6P3hWchZklj0OTh381wSvMqYDlwbTO/fY625uta4D1J0lK5GseSLAJOAX7UjIM50uC4bWvYfPUsBB7vG69r5qRdtX9VPQm9L9XAgmbebOkNNafsvAO4HbOkMWpOFbsH2AisBh4Gnq6qLc0q/VnZlqPm82eAue1WrHHq28AXgFeb8VzMkXZPATcmuSvJuc2c27bGSNcFjBM721vjPfg1CGZLryvJLOBXwGeq6tnX2XlslrRTVfUKcESS/YDrgMN2tlrzbo60gySnAhur6q4kx2+d3smq5ki74tiqWp9kAbA6yQOvs+6ky5JHvnrWAQf2jRcB6zuqRW9OG7YeJm/eNzbzZkuvKckUeo3XlVX162baLGm3VNXTwK30riHcL8nWHaz9WdmWo+bzfdnxNGpNPscCpyV5jN6lF8vpHQkzRxqzqlrfvG+kt0NoGW7btrH56rkDGG3u6jMVOANY1XFNenNZBZzVLJ8F/LZv/qPN3XyOBp7Zethdk1tzfcSPgTVV9a2+j8ySdlmS+c0RL5LMAE6kd/3gLcDpzWrb52hrvk4Hbq6qCb2XWW+sqr5YVYuqagm970A3V9WHMUcaoyR7Jdl76zJwEnAfbtu2if9XepK8l95enmHg8qq6qOOSNE4l+QVwPDAP2ABcCPwGuAZYDPwT+GBVbWq+YH+X3t0RXwA+VlV3dlG3xpckxwG/B+7lf9dYfInedV9mSbskyeH0Ll4fprdD9Zqq+kqSQ+gdwZgD3A18pKpeSjId+Bm9aww3AWdU1SPdVK/xqDnt8HNVdao50lg1mbmuGY4AV1XVRUnm4rYNsPmSJEmSpFZ42qEkSZIktcDmS5IkSZJaYPMlSZIkSS2w+ZIkSZKkFth8SZIkSVILbL4kSRNKkvOSrElyZde1SJLUz1vNS5ImlCQPACdX1aO7sO5IVW1poSxJkhjpugBJkgYlyQ+AQ4BVSX4OrARmAC/Se3jng0nOBk4BpgN7AcuTfB74EDANuK6qLuyifknSxGbzJUmaMKrqE0lWACcALwPfrKotSU4EvgZ8oFn1GODwqtqU5CRgFFgGhF7j9u6quq2Dv4IkaQKz+ZIkTVT7AlckGQUKmNL32eqq2tQsn9S87m7Gs+g1YzZfkqSBsvmSJE1UXwVuqar3J1kC3Nr32fN9ywEurqoftleaJGky8m6HkqSJal/giWb57NdZ7wbg40lmASRZmGTBHq5NkjQJ2XxJkiaqbwAX
J/kDMPxaK1XVjcBVwB+T3AtcC+zdTomSpMnEW81LkiRJUgs88iVJkiRJLbD5kiRJkqQW2HxJkiRJUgtsviRJkiSpBTZfkiRJktQCmy9JkiRJaoHNlyRJkiS14L8TJWUrg7Ae5gAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 864x360 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
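    "# Scatter plot of \"fare\" against \"age\"; both columns are bucketized (buckets=30) before plotting\n",
    "df.plot.scatter([\"fare\", \"age\"], buckets=30)\n",
    "# Same plot, written to an image file at the given path\n",
    "df.plot.scatter([\"fare\", \"age\"], buckets=30, output_format=\"image\", output_path=\"images/scatter.png\")"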
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "lines_to_next_cell": 0
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'age' with function _cast_to\n",
      "INFO:optimus:percentile() executed in 6.89 sec\n",
      "INFO:optimus:Using 'column_exp' to process column 'age' with function _cast_to\n",
      "INFO:optimus:percentile() executed in 3.92 sec\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/box.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAEICAYAAABPgw/pAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAD11JREFUeJzt3X+s3XV9x/Hn695eUqhiS7kaVn5ctrJZ6TI1N0ZLN20x2SzL0EyDbFNm7tI0hsrWLbPu/qHL1qUms4506yZ4XUhmSxVFiDQq02JScMTbwRxw56gUC1j1GkFZoXrbfvbHPSVFW++5595zT/s5z0fS3Hu+5/vtefefZ7/5nHO+35RSkCSd+Xo6PYAkaXYYdEmqhEGXpEoYdEmqhEGXpEoYdEmqhEGXpEoYdEmqhEGXpEoYdFUvycYk30rybJJHkrytsb03yUeS/CDJ/iTXJylJ5jWef1mSkSQHkzyV5G+T9Hb2XyOd2rxODyDNgW8Bvwl8F3gH8G9JlgJXA28BXg0cAj79M8fdAnwPWAosAD4PPAF8bG7GlqYnXstF3SbJg8AHgRuAnaWUjzW2vxm4G+gDFgMHgIWllOcbz18LrC2lrOrI4NIUPENX9ZK8G9gADDQ2vQQ4H/glJs+4jzvx90uYDPvBJMe39fzMPtJpxaCrakkuAW4GrgS+Vko52jhDD3AQuPCE3S864fcngJ8A55dSjszVvNJM+KaoarcAKMA4QJL3AMsbz30KuCHJkiQLgfcfP6iUchD4EvCRJOcm6UnyK0neOLfjS80z6KpaKeUR4CPA15h8g/PXgXsbT9/MZLS/ATwA7AKOAEcbz78bOAt4BHgauA24YK5ml6bLN0WlhiRvAf6llHJJp2eRWuEZurpWkrOTrEkyL8kSJj/5cnun55Ja5Rm6ulaSc4CvAq8EngfuAm4opfy4o4NJLTLoklQJl1wkqRJz+jn0888/vwwMDMzlS0rSGW/v3r0/KKX0T7XfnAZ9YGCA0dHRuXxJSTrjJfl2M/u55CJJlTDoklQJgy5JlTDoklQJgy5JlWgq6En+LMnDSR5KsiPJ/CSXJrk/yaNJdiY5q93DSrNtx44dLF++nN7eXpYvX86OHTs6PZLUsimD3rjGxfuAwVLKcqAXeCfwYeCjpZTLmLwS3VA7B5Vm244dOxgeHmbr1q0cPnyYrVu3Mjw8bNR1xmp2yWUecHbj5rnnMHljgNVMXk4UJu+9+NbZH09qn02bNjEyMsKqVavo6+tj1apVjIyMsGnTpk6PJrVkyqCXUp4C/p7J+yseBH4E7AWeOeFOLk8CS052fJK1SUaTjI6Pj8/O1NIsGBsbY+XKlS/atnLlSsbGxjo0kTQzzSy5LGLy7uiXMnkPxgVM3in9Z530Kl+llJtKKYOllMH+/im/uSrNmWXLlrFnz54XbduzZw/Lli3r0ETSzDSz5PJmYH8pZbyUMgF8FlgBLGwswcDkfRm/06YZpbYYHh5maGiI3bt3MzExwe7duxkaGmJ4eLjTo0ktaeZaLgeA1zeuHf08kzfbHQV2A28HbgWuA+5o15BSO1x77bUArF+/nrGxMZYtW8amTZte2C6daZq6HnqSvwauYfJ+iw8Af8LkmvmtwHmNbX9USvnJL/p7BgcHixfnkqTpSbK3lDI41X5NXW2xlPJBJm/PdaLHgNe1MJskqQ38pqi6ml8sUk3m9Hro0unk+BeLRkZGWLlyJXv27GFoaPL7ca6j60w0p/cUdQ1dp5Ply5ezdetWVq1a9cK23bt3s379eh566KEOTia9WLNr6AZdXau3t5fDhw/T19f3wraJiQnmz5/P0aNHOziZ9GLNBt01dHUtv1ik2riGrq41PDzMNddcw4IFCzhw4AAXX3wxhw4d4sYbb+z0aFJLPEOXgLlcepTaxaCra23atImdO3eyf/9+jh07xv79+9m5c6dXW9QZyzdF1bV8U1Rn
Ct8Ulabgm6KqjUFX1/Jqi6qNn3JR1/Jqi6qNZ+jqavfddx/79u3j2LFj7Nu3j/vuu6/TI0ktM+jqWuvXr2fbtm0sWrSInp4eFi1axLZt21i/fn2nR5Na4qdc1LX6+vro7e3l2LFjTExM0NfXR09PD0ePHmViYqLT40kv8FMu0hSOHDnCxMQEmzdv5tChQ2zevJmJiQmOHDky9cHSacigq6utWbOGDRs2cM4557BhwwbWrFnT6ZGklhl0dbVdu3axZcsWnnvuObZs2cKuXbs6PZLUMtfQ1bVcQ9eZwjV0aQrr1q1jYmKCxYsX09PTw+LFi5mYmGDdunWdHk1qiV8sUtfaunUrADfffDPHjh3j6aef5r3vfe8L26UzjWfo6morVqxg6dKl9PT0sHTpUlasWNHpkaSWeYauruVNolUb3xRV1/Im0TpTeJNoaQpeD11nCj/lIk3B66GrNq6hq0pJmtpv9erVMzree5HqdOIZuqpUSmnqz/bt27n88sshPVx++eVs37696WONuU43rqFLwMDGu3h881WdHkM6KdfQJanLGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqkRTQU+yMMltSf4nyViSNyQ5L8ndSR5t/FzU7mElSafW7Bn6jcAXSimvBH4DGAM2Al8upVwGfLnxWJLUIVMGPcm5wG8BIwCllJ+WUp4BrgZuaex2C/DWdg0pSZpaM2fovwyMA/+a5IEkH0+yAHhFKeUgQOPny092cJK1SUaTjI6Pj8/a4JKkF2sm6POA1wL/XEp5DXCIaSyvlFJuKqUMllIG+/v7WxxTkjSVZoL+JPBkKeX+xuPbmAz895JcAND4+f32jChJasaUQS+lfBd4IsmvNTZdCTwC3Alc19h2HXBHWyaUJDWl2TsWrQc+meQs4DHgPUz+Z/CpJEPAAeAd7RlRktSMpoJeSnkQONnF1a+c3XEkSa3ym6KSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVImmg56kN8kDST7feHxpkvuTPJpkZ5Kz2jemJGkq0zlDvwEYO+Hxh4GPllIuA54GhmZzMEnS9DQV9CQXAlcBH288DrAauK2xyy3AW9sxoCSpOc2eof8D8JfAscbjxcAzpZQjjcdPAktOdmCStUlGk4yOj4/PaFhJ0qlNGfQkvwt8v5Sy98TNJ9m1nOz4UspNpZTBUspgf39/i2NKkqYyr4l9rgB+L8kaYD5wLpNn7AuTzGucpV8IfKd9Y6qbXbH5Kzz1zPNtf52BjXe19e9fsvBs7t24uq2voe42ZdBLKR8APgCQ5E3AX5RS/jDJp4G3A7cC1wF3tHFOdbGnnnmexzdf1ekxZqzd/2FIM/kc+vuBDUn2MbmmPjI7I0mSWtHMkssLSin3APc0fn8MeN3sjyRJaoXfFJWkShh0SaqEQZekShh0SaqEQZekShh0SaqEQZekShh0SaqEQZekShh0SaqEQZekShh0SaqEQZekShh0SaqEQZekShh0SaqEQZekShh0SaqEQZekShh0SaqEQZekShh0SarEvE4PIE3l8fl/AB/q9BQz9/h8gB91egxVzKDrtDdweDuPb76q02PM2MDGu3i800Ooai65SFIlDLokVcKgS1IlDLokVcKgS1IlDLokVcKgS1IlDLokVcKgS1IlDLokVcKgS1IlDLokVcKgS1Ilpgx6kouS7E4yluThJDc0tp+X5O4kjzZ+Lmr/uJKkU2nm8rlHgD8vpfxnkpcCe5PcDfwx8OVSyuYkG4GNwPvbN6q61ZKFZzOw8a5OjzFjSxae3ekRVLkpg15KOQgc
bPz+bJIxYAlwNfCmxm63APdg0NUG925c3fbXGNh4VxXXXFd3m9YaepIB4DXA/cArGrE/Hv2Xn+KYtUlGk4yOj4/PbFpJ0ik1HfQkLwE+A/xpKeXHzR5XSrmplDJYShns7+9vZUZJUhOaCnqSPiZj/slSymcbm7+X5ILG8xcA32/PiJKkZjTzKZcAI8BYKWXLCU/dCVzX+P064I7ZH0+S1KxmPuVyBfAu4L+TPNjY9lfAZuBTSYaAA8A72jOiJKkZzXzKZQ+QUzx95eyOI0lqld8UlaRKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKzCjoSX4nyTeT7EuycbaGkiRNX8tBT9IL/BPwFuBVwLVJXjVbg0mSpmcmZ+ivA/aVUh4rpfwUuBW4enbGkiRN10yCvgR44oTHTza2vUiStUlGk4yOj4/P4OUkSb/ITIKek2wrP7ehlJtKKYOllMH+/v4ZvJwk6ReZSdCfBC464fGFwHdmNo4kqVUzCfrXgcuSXJrkLOCdwJ2zM5YkabrmtXpgKeVIkuuBLwK9wCdKKQ/P2mSSpGlpOegApZRdwK5ZmkWSNAN+U1SSKmHQJakSBl2SKmHQJakSBl2SKmHQJakSBl2SKmHQJakSM/pikXS6Sk527bgpjvnw9F+nlJ+7Hp3UMQZdVTK06kYuuUhSJQy6JFXCoEtSJQy6JFXCoEtSJQy6JFXCoEtSJQy6JFUic/kFjCTjwLfn7AWl5p0P/KDTQ0incEkppX+qneY06NLpKsloKWWw03NIM+GSiyRVwqBLUiUMujTppk4PIM2Ua+iSVAnP0CWpEgZdkiph0CWpEgZdkiph0NU1knwuyd4kDydZ29g2lOR/k9yT5OYk/9jY3p/kM0m+3vhzRWenl6bmp1zUNZKcV0r5YZKzga8Dvw3cC7wWeBb4CvBfpZTrk2wHtpVS9iS5GPhiKWVZx4aXmuBNotVN3pfkbY3fLwLeBXy1lPJDgCSfBn618fybgVclOX7suUleWkp5di4HlqbDoKsrJHkTk5F+QynluST3AN8ETnXW3dPY9/m5mVCaOdfQ1S1eBjzdiPkrgdcD5wBvTLIoyTzg90/Y/0vA9ccfJHn1nE4rtcCgq1t8AZiX5BvA3wD/ATwF/B1wP/DvwCPAjxr7vw8YTPKNJI8A6+Z+ZGl6fFNUXS3JS0op/9c4Q78d+EQp5fZOzyW1wjN0dbsPJXkQeAjYD3yuw/NILfMMXZIq4Rm6JFXCoEtSJQy6JFXCoEtSJQy6JFXi/wHno0/sqV6NfgAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "<Figure size 864x360 with 0 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAEICAYAAABPgw/pAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAD11JREFUeJzt3X+s3XV9x/Hn695eUqhiS7kaVn5ctrJZ6TI1N0ZLN20x2SzL0EyDbFNm7tI0hsrWLbPu/qHL1qUms4506yZ4XUhmSxVFiDQq02JScMTbwRxw56gUC1j1GkFZoXrbfvbHPSVFW++5595zT/s5z0fS3Hu+5/vtefefZ7/5nHO+35RSkCSd+Xo6PYAkaXYYdEmqhEGXpEoYdEmqhEGXpEoYdEmqhEGXpEoYdEmqhEGXpEoYdFUvycYk30rybJJHkrytsb03yUeS/CDJ/iTXJylJ5jWef1mSkSQHkzyV5G+T9Hb2XyOd2rxODyDNgW8Bvwl8F3gH8G9JlgJXA28BXg0cAj79M8fdAnwPWAosAD4PPAF8bG7GlqYnXstF3SbJg8AHgRuAnaWUjzW2vxm4G+gDFgMHgIWllOcbz18LrC2lrOrI4NIUPENX9ZK8G9gADDQ2vQQ4H/glJs+4jzvx90uYDPvBJMe39fzMPtJpxaCrakkuAW4GrgS+Vko52jhDD3AQuPCE3S864fcngJ8A55dSjszVvNJM+KaoarcAKMA4QJL3AMsbz30KuCHJkiQLgfcfP6iUchD4EvCRJOcm6UnyK0neOLfjS80z6KpaKeUR4CPA15h8g/PXgXsbT9/MZLS/ATwA7AKOAEcbz78bOAt4BHgauA24YK5ml6bLN0WlhiRvAf6llHJJp2eRWuEZurpWkrOTrEkyL8kSJj/5cnun55Ja5Rm6ulaSc4CvAq8EngfuAm4opfy4o4NJLTLoklQJl1wkqRJz+jn0888/vwwMDMzlS0rSGW/v3r0/KKX0T7XfnAZ9YGCA0dHRuXxJSTrjJfl2M/u55CJJlTDoklQJgy5JlTDoklQJgy5JlWgq6En+LMnDSR5KsiPJ/CSXJrk/yaNJdiY5q93DSrNtx44dLF++nN7eXpYvX86OHTs6PZLUsimD3rjGxfuAwVLKcqAXeCfwYeCjpZTLmLwS3VA7B5Vm244dOxgeHmbr1q0cPnyYrVu3Mjw8bNR1xmp2yWUecHbj5rnnMHljgNVMXk4UJu+9+NbZH09qn02bNjEyMsKqVavo6+tj1apVjIyMsGnTpk6PJrVkyqCXUp4C/p7J+yseBH4E7AWeOeFOLk8CS052fJK1SUaTjI6Pj8/O1NIsGBsbY+XKlS/atnLlSsbGxjo0kTQzzSy5LGLy7uiXMnkPxgVM3in9Z530Kl+llJtKKYOllMH+/im/uSrNmWXLlrFnz54XbduzZw/Lli3r0ETSzDSz5PJmYH8pZbyUMgF8FlgBLGwswcDkfRm/06YZpbYYHh5maGiI3bt3MzExwe7duxkaGmJ4eLjTo0ktaeZaLgeA1zeuHf08kzfbHQV2A28HbgWuA+5o15BSO1x77bUArF+/nrGxMZYtW8amTZte2C6daZq6HnqSvwauYfJ+iw8Af8LkmvmtwHmNbX9USvnJL/p7BgcHixfnkqTpSbK3lDI41X5NXW2xlPJBJm/PdaLHgNe1MJskqQ38pqi6ml8sUk3m9Hro0unk+BeLRkZGWLlyJXv27GFoaPL7ca6j60w0p/cUdQ1dp5Ply5ezdetWVq1a9cK23bt3s379eh566KEOTia9WLNr6AZdXau3t5fDhw/T19f3wraJiQnmz5/P0aNHOziZ9GLNBt01dHUtv1ik2riGrq41PDzMNddcw4IFCzhw4AAXX3wxhw4d4sYbb+z0aFJLPEOXgLlcepTaxaCra23atImdO3eyf/9+jh07xv79+9m5c6dXW9QZyzdF1bV8U1Rn
Ct8Ulabgm6KqjUFX1/Jqi6qNn3JR1/Jqi6qNZ+jqavfddx/79u3j2LFj7Nu3j/vuu6/TI0ktM+jqWuvXr2fbtm0sWrSInp4eFi1axLZt21i/fn2nR5Na4qdc1LX6+vro7e3l2LFjTExM0NfXR09PD0ePHmViYqLT40kv8FMu0hSOHDnCxMQEmzdv5tChQ2zevJmJiQmOHDky9cHSacigq6utWbOGDRs2cM4557BhwwbWrFnT6ZGklhl0dbVdu3axZcsWnnvuObZs2cKuXbs6PZLUMtfQ1bVcQ9eZwjV0aQrr1q1jYmKCxYsX09PTw+LFi5mYmGDdunWdHk1qiV8sUtfaunUrADfffDPHjh3j6aef5r3vfe8L26UzjWfo6morVqxg6dKl9PT0sHTpUlasWNHpkaSWeYauruVNolUb3xRV1/Im0TpTeJNoaQpeD11nCj/lIk3B66GrNq6hq0pJmtpv9erVMzree5HqdOIZuqpUSmnqz/bt27n88sshPVx++eVs37696WONuU43rqFLwMDGu3h881WdHkM6KdfQJanLGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqkRTQU+yMMltSf4nyViSNyQ5L8ndSR5t/FzU7mElSafW7Bn6jcAXSimvBH4DGAM2Al8upVwGfLnxWJLUIVMGPcm5wG8BIwCllJ+WUp4BrgZuaex2C/DWdg0pSZpaM2fovwyMA/+a5IEkH0+yAHhFKeUgQOPny092cJK1SUaTjI6Pj8/a4JKkF2sm6POA1wL/XEp5DXCIaSyvlFJuKqUMllIG+/v7WxxTkjSVZoL+JPBkKeX+xuPbmAz895JcAND4+f32jChJasaUQS+lfBd4IsmvNTZdCTwC3Alc19h2HXBHWyaUJDWl2TsWrQc+meQs4DHgPUz+Z/CpJEPAAeAd7RlRktSMpoJeSnkQONnF1a+c3XEkSa3ym6KSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVAmDLkmVMOiSVImmg56kN8kDST7feHxpkvuTPJpkZ5Kz2jemJGkq0zlDvwEYO+Hxh4GPllIuA54GhmZzMEnS9DQV9CQXAlcBH288DrAauK2xyy3AW9sxoCSpOc2eof8D8JfAscbjxcAzpZQjjcdPAktOdmCStUlGk4yOj4/PaFhJ0qlNGfQkvwt8v5Sy98TNJ9m1nOz4UspNpZTBUspgf39/i2NKkqYyr4l9rgB+L8kaYD5wLpNn7AuTzGucpV8IfKd9Y6qbXbH5Kzz1zPNtf52BjXe19e9fsvBs7t24uq2voe42ZdBLKR8APgCQ5E3AX5RS/jDJp4G3A7cC1wF3tHFOdbGnnnmexzdf1ekxZqzd/2FIM/kc+vuBDUn2MbmmPjI7I0mSWtHMkssLSin3APc0fn8MeN3sjyRJaoXfFJWkShh0SaqEQZekShh0SaqEQZekShh0SaqEQZekShh0SaqEQZekShh0SaqEQZekShh0SaqEQZekShh0SaqEQZekShh0SaqEQZekShh0SaqEQZekShh0SaqEQZekShh0SarEvE4PIE3l8fl/AB/q9BQz9/h8gB91egxVzKDrtDdweDuPb76q02PM2MDGu3i800Ooai65SFIlDLokVcKgS1IlDLokVcKgS1IlDLokVcKgS1IlDLokVcKgS1IlDLokVcKgS1IlDLokVcKgS1Ilpgx6kouS7E4yluThJDc0tp+X5O4kjzZ+Lmr/uJKkU2nm8rlHgD8vpfxnkpcCe5PcDfwx8OVSyuYkG4GNwPvbN6q61ZKFZzOw8a5OjzFjSxae3ekRVLkpg15KOQgc
bPz+bJIxYAlwNfCmxm63APdg0NUG925c3fbXGNh4VxXXXFd3m9YaepIB4DXA/cArGrE/Hv2Xn+KYtUlGk4yOj4/PbFpJ0ik1HfQkLwE+A/xpKeXHzR5XSrmplDJYShns7+9vZUZJUhOaCnqSPiZj/slSymcbm7+X5ILG8xcA32/PiJKkZjTzKZcAI8BYKWXLCU/dCVzX+P064I7ZH0+S1KxmPuVyBfAu4L+TPNjY9lfAZuBTSYaAA8A72jOiJKkZzXzKZQ+QUzx95eyOI0lqld8UlaRKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKGHRJqoRBl6RKzCjoSX4nyTeT7EuycbaGkiRNX8tBT9IL/BPwFuBVwLVJXjVbg0mSpmcmZ+ivA/aVUh4rpfwUuBW4enbGkiRN10yCvgR44oTHTza2vUiStUlGk4yOj4/P4OUkSb/ITIKek2wrP7ehlJtKKYOllMH+/v4ZvJwk6ReZSdCfBC464fGFwHdmNo4kqVUzCfrXgcuSXJrkLOCdwJ2zM5YkabrmtXpgKeVIkuuBLwK9wCdKKQ/P2mSSpGlpOegApZRdwK5ZmkWSNAN+U1SSKmHQJakSBl2SKmHQJakSBl2SKmHQJakSBl2SKmHQJakSM/pikXS6Sk527bgpjvnw9F+nlJ+7Hp3UMQZdVTK06kYuuUhSJQy6JFXCoEtSJQy6JFXCoEtSJQy6JFXCoEtSJQy6JFUic/kFjCTjwLfn7AWl5p0P/KDTQ0incEkppX+qneY06NLpKsloKWWw03NIM+GSiyRVwqBLUiUMujTppk4PIM2Ua+iSVAnP0CWpEgZdkiph0CWpEgZdkiph0NU1knwuyd4kDydZ29g2lOR/k9yT5OYk/9jY3p/kM0m+3vhzRWenl6bmp1zUNZKcV0r5YZKzga8Dvw3cC7wWeBb4CvBfpZTrk2wHtpVS9iS5GPhiKWVZx4aXmuBNotVN3pfkbY3fLwLeBXy1lPJDgCSfBn618fybgVclOX7suUleWkp5di4HlqbDoKsrJHkTk5F+QynluST3AN8ETnXW3dPY9/m5mVCaOdfQ1S1eBjzdiPkrgdcD5wBvTLIoyTzg90/Y/0vA9ccfJHn1nE4rtcCgq1t8AZiX5BvA3wD/ATwF/B1wP/DvwCPAjxr7vw8YTPKNJI8A6+Z+ZGl6fFNUXS3JS0op/9c4Q78d+EQp5fZOzyW1wjN0dbsPJXkQeAjYD3yuw/NILfMMXZIq4Rm6JFXCoEtSJQy6JFXCoEtSJQy6JFXi/wHno0/sqV6NfgAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
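    "# Box plot of the `age` column, rendered inline\n",
    "df.plot.box(\"age\")\n",
    "# Same plot, this time written to images/box.png\n",
    "df.plot.box(\"age\", output_format=\"image\", output_path=\"images/box.png\")"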
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "lines_to_next_cell": 0
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:`name`,`sex`,`ticket`,`cabin`,`embarked`,`boat`,`home_dest` column(s) was not processed because is/are not byte,short,big,int,double,float\n",
      "INFO:optimus:Using 'column_exp' to process column 'pclass' with function _cast_to\n",
      "INFO:optimus:Casting pclass to float...\n",
      "INFO:optimus:Using 'column_exp' to process column 'survived' with function _cast_to\n",
      "INFO:optimus:Casting survived to float...\n",
      "INFO:optimus:Using 'column_exp' to process column 'age' with function _cast_to\n",
      "INFO:optimus:Casting age to float...\n",
      "INFO:optimus:Using 'column_exp' to process column 'sibsp' with function _cast_to\n",
      "INFO:optimus:Casting sibsp to float...\n",
      "INFO:optimus:Using 'column_exp' to process column 'parch' with function _cast_to\n",
      "INFO:optimus:Casting parch to float...\n",
      "INFO:optimus:Using 'column_exp' to process column 'fare' with function _cast_to\n",
      "INFO:optimus:Casting fare to float...\n",
      "INFO:optimus:Using 'column_exp' to process column 'body' with function _cast_to\n",
      "INFO:optimus:Casting body to float...\n",
      "object of type <class 'float'> cannot be safely interpreted as an integer.\n",
      "INFO:optimus:`name`,`sex`,`ticket`,`cabin`,`embarked`,`boat`,`home_dest` column(s) was not processed because is/are not byte,short,big,int,double,float\n",
      "INFO:optimus:Using 'column_exp' to process column 'pclass' with function _cast_to\n",
      "INFO:optimus:Casting pclass to float...\n",
      "INFO:optimus:Using 'column_exp' to process column 'survived' with function _cast_to\n",
      "INFO:optimus:Casting survived to float...\n",
      "INFO:optimus:Using 'column_exp' to process column 'age' with function _cast_to\n",
      "INFO:optimus:Casting age to float...\n",
      "INFO:optimus:Using 'column_exp' to process column 'sibsp' with function _cast_to\n",
      "INFO:optimus:Casting sibsp to float...\n",
      "INFO:optimus:Using 'column_exp' to process column 'parch' with function _cast_to\n",
      "INFO:optimus:Casting parch to float...\n",
      "INFO:optimus:Using 'column_exp' to process column 'fare' with function _cast_to\n",
      "INFO:optimus:Casting fare to float...\n",
      "INFO:optimus:Using 'column_exp' to process column 'body' with function _cast_to\n",
      "INFO:optimus:Casting body to float...\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/correlation.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYYAAAEdCAYAAAAIIcBlAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAIABJREFUeJzsnXd4VEUXh9+zqSSh9xIITToECFWkSVdUBKVYQUCaBUUs8CFSBKmighI60hMQQgtNOkjvVUAihN5CCSEkO98fuy7ZZEMWTDZcmfd57sPeO+fOmd8uuWfanRGlFBqNRqPR/IMpvQug0Wg0micLHRg0Go1GY4cODBqNRqOxQwcGjUaj0dihA4NGo9Fo7NCBQaPRaDR26MCg0Wg0Gjt0YNBoNBqNHTowaDQajcYO9/QugCv4s1bjdHm9e3L3D9PDLQCXo26li9/uTWuni1+Ae/fj0s33FzPC0s33uv490su1PM5Nj/L3WHzTisfykd4YXaNuMWg0Go3GjqeixaDRaJ4g5Cmojxpcow4MGo3GtcgT13OS+hhcow4MGo3GpYibW3oXIc0xukYdGDQajWsxeG3aKQyuUQcGjUbjWkzGfmg6hcE16sCg0Whcihi8Nu0MRteoA4NGo3EtJmPP2HEKg2vUgUGj0bgWgz80ncLgGnVg0Gg0LkUM/tB0BqNr1IEhGQrNnoRH/nyouDhO1m+eJj4+bFKLXJkzYlaK+dv2sS/ifBKbvi0a4OnujghcvR3ND+EbMJuhW6NnyZslEwrF3Xv3GbdqM1HRMU75/bpVY/JmzYxZKaau28b2E38nazvy7Zfx8fSk68QQAF6rEUj9MsVxdzMx6fc/+OPPiIf6WjDzV+bPnA5KUTawEl8OGWaXHn3nNr06dyDq2jXcPTz4YvAwSpUrZ0s/fvgQ/T7uQcVq1fl84BAAhn/dl307twOQJWtWhv4yCT8/P7t8F82eyaI5M0ApSleoSC/rvTa/0Xfo07UTUdctfj8d8C0lypTjQmQk3/TsTvSdOxQoFMDgcRNs90z/+UfWr1iOAF4ZMtB32Pfk9fd/qH6AiV1aE5AzG/FmxbCw31lz4HgSmyVfdMLL3R2zsqyk0Dl4LhGXrxPc+XUK5cwGgMkkuJtM1PtmbIo+n2hS8aEpIk2AMYAbMFEpNTRReiFgMpATuAa8qZQ6m2oFSA6Da0zXsCYidUVkSXqWITluhC7iwoBhKRs+JvXLFCNjBm/6zl1O2M6DtKhS3qHdD8s30i8knP/NC8fbw51mgaUBOHTmAt+ErqDfvHBuRN+l3bOVnPLbvHIZsvhmoMuEeczauIu361RJ1rZVtQrcjzfbXTsWeZGfwjdyN/Z+ir7iYmMJnTGNr4YMZ9L8MA4f2MfOrZvtbCZ8P5IMGXyYuXw19Ro35cehA+3SRw/8mmw5ctjOTx0/xp5tf/DL7FBmLF2J2ayYEfxzEr8LZ//KZwOHMm7OAo4e3M/ubVvtbKb8MBrvDBmYHBbOcw0b88swS+Dw8fWh1dsdqF6nnp19bGwsvy9dzKAfxzNx4TJy5cnLlJ++T/E7eLtOFXJk9KPBwJ8Zs2w9vZrXTdZ2eNjvNB78C40H/0LE5esAdA6eZ7u2+ehfnL12I0WfTzwizh8PzUbcgLFAU6A00FZESicyGwFMV0qVBwYAQ3AFBtdo7PZOGhI1P4y4i5fSLP/yhfKx/29LC2HnqbO4mUzkyZwxid3NmHsAuJtMmMTEPytzrT9ykjiz5aF96tJV/Ly9nPJbpVhBWwth07FTuJtMFMieOYldJm8v6pQpxqxNu+yu7//7PIfOXnDK17pVK/Dx9aVMhUC8fXwoVa48K8MW2tkc2LObpi1aAtCuUxeuX7mC2apr1qRgsmbPQe68+RPlrLh96xaxsbHExd0nd958dqmb1qzCx8eXUuUr4O3jQ4my5VizZJGdzeG9e2j88qsAtO7QmWtXLX4z
ZcnK8y++hKen/ff5T5lu3bqJ2Wwm5u5dsuXImeJ3UL9scdYctLQQlu05grvJjaK5s6d4nyOqFivIst1HHuveJwkRcfpIgarACaXUKaVULDAHeDmRTWlgjfXzWgfpaYLRNaZ6YBCRABE5KiLTRGS/iISKiI+IVBGRLSKyT0S2i0jGRPdVtabvsf5bwnq9jNV+rzW/4iLiKyJLrXkdFJHWqa0jrfH18uRSghVQY+PiyJs1k0Pbvq82ZMDrTYiLj2f53sNJ0isX9uf4+ctO+fXz8uL89Sjb+b37cRTIliWJ3Ucv1GHtwT+JvhfrVL6OOH/2DBkzPdCUM08erl+9amcTc/cuAUWLA+Dp6YmYTJyPjCTq+nVWhi2k94Bv7eyLPFOCyjVq8nH7N3mneRM8vbxo0fYNe7+RZ/BL4DdH7jzcSOw35i4Fixa1+TWZTFyMjExWi7e3N41feZXBn31Mh5eacP3qFTp+9EmK30HmDN78feW67fzu/fsUzZ3DoW2v5vVY0acLI99O+nddpkAevD3cCdm6N0WfTzwmcfoQkc4isjPB0TlBTvmBMwnOz1qvJWQf0NL6uQWQUUQeLzI/CgbXmFYthhJAsLVpcxPoAcwFPlJKVQAaAHcT3XMUqK2Uqgj0A/55InQBxiilAoEgLF9ME+CcUqqCUqosEJ5GOlzKP/3LiRm0YBUDQldiMgm1ShSxS+tQtypKKRbtPPjYfhO7DSriTxbfDCzYvv+x87Tkm1SPM/O7TSYY2vcL6jVpRpZs2ezSLp4/z+F9exk5cSpTFi0jLvY+E8eMTtGvM2+iPmzAMCYmhg2rVvD5t8OYHBZO9py5GNm/b4p5OsLR7/zJtIU0GvwLbcdMp1ieHHzyYl279LfqVOGvS9dsrURD4+bm9KGUClZKBSU4ghPk5OhHTfzl9gLqiMgeoA4QCaT9+uwG15hWg89nlFL/dCbPAPoA55VSOwCUUjchyUMiMzBNRIpjEe5hvb4V6CMiBYAFSqk/ReQAMEJEvgOWKKU2Ji6ANep2BhhQtDRt8hRIbY2PTNuaFSmVPzcA1+/cJVeCriNPd3cu3kh+D4WYuDiOnbtExYD8bDhyCoAWVcpSMEdWhoWtSfY+gPcb1KBCgKWScfVWNHmzPug68vJwJ/K6fb91xcIFyOjtzfjOryNYfqcf2r/Kh1MWPJLefP4F2bB6pe388oULSR703hkycPrknxQrWZLY2FiU2UzuvPm5EHmWiJMnWLHoN1s3zuiB/cmeMycZM2chf8FCAFSqXoOjB+0DWL4CBdm8ZrXt/MpFB369M/D3yZMUecbi12w2kytv3mS17Ni0AYDSFSoCUKdJM8Jmz3Ro269VY54tURiAizduUTBHVltaBg8P/rp0Nck9/7T4rt2OZtPRU5Txz2OXHhiQj5/CNyVbPiORii9/nQUSjv4XAM4lNFBKnQNetfr1A1oqpaJIY4yuMa1aDIkj2k0H1xIzEFhrbQE0B7wBlFKzgJewtDBWiEh9pdRxoDJwABgiIv2SFCBBFH4SggLA7C176BcSTr+QcPb/fY7yBS0PoqAiBYhXZi4k2lzHz8vTNu7gbjLxTN5cXIq6DUC9MsWoVLgAY1dsIjr24ZWD8au30m1iKN0mhrL9RARVixUEoFaJIsSbFWev2v8fmrBmK52D5/J+8DyGhf1OXLz5kYMCQO2GjYm+c4cj+/cREx3NkQP7adjcvpukbGBFlv82H4BZE34ha/YcmEwmpixcyqzwNcwKX0OpchWoWK06Pf/XnwKFArhy6SI3o25gNps5uHc3+f0L2uVZ8/mGREff4eiB/cRER3Ps4AHqN7OfWVY6MJAViyya5k4OJmv27Jge0mLwDwjg7p07RJ6xzMLasXED2XPlcmg7IHSFbcB4zcHjPF/2GQCaVSxFnNnMyYv2gcHD7cE4j6e7O1WLFeKvBDZVihbEw82NJbsOJVs+QyEm54+HswMoLiKFRcQTaAPY7ZgkIjlEbBl9iWX2Ttpj
cI1p1WIoKCI1lFJbgbbAH8D7IlJFKbXDOr6QuCspM5YmEMC7/1wUkSLAKaXUD9bP5UXkKHBNKTVDRG4ntE8tAkKn454rJ4hQbP0yboav4dKQkamW/5qDf1LWPw+DWjfFrBS/bT9gSxvwWhP6hYTj6+1J10Y1LYNUwOWbdwj5Yx8Az5ctjgAfNHkOgGt3ovl+2YYU/S7edYjKRfwZ3+l1zEoxbf12W9q4jq3oNjH0ofe/WrU8TQJLIiJ0qF+d1jUr0XPabw5tPT09ebXdmwz6vBcKRZnygVSpWYv+n3xIyXLladO+Ix0/7kXvzu15o2kD3D086D3w4RMq6jd9gY2rV9Kt7WsgQtbsOXj/08+S+H2pdTuG9emNAkqVK0/lGs8yuHdPSpQpR6t3OvBuj5706daJDi81wc3dg0/6D7Ld3755Y1sr5d3mjfmwb38qVatBtdp1+V+PLphEyODrS59hKc9KmrZ+B3VKF2X1/7piNitGLF5rS1vRpwuNB/+Cj5cnE99vYx2MhIjL1xm68EEr8I3nKnM0Mu0mQricVFpHSCkVJyI9gBVYpnJOVkodEpEBwE6lVBhQF0vlUQEbgO6p4jwlDK5RHPbH/psMRQKAZVgKWBP4E3gLKAP8CGTAEhQaYBkz6KWUelFEagDTgMvA78BbSqkAEfkSeBO4D1wA2gFVgOGA2Xq9q1JqZ3Jl0lt7ug69tafrMdrWnhHtOjr991ho1kRDLjpkdI1p1WIwK6W6JLq2A6ie6No664G1dfFMgrT/Wa8PIem83BXWQ6PRGA2DLzDnFAbXqN981mg0rsXgm9g4hcE1pnpgUEqdBsqmdr4ajea/gdGXpHYGo2vULQaNRuNaDL6JjVMYXKMODBqNxrWkPEXT+Bhcow4MGo3GtRi8m8UpDK5RBwaNRuNSxM3YtWlnMLpGHRg0Go1rMfgmNk5hcI06MGg0Gpdi9N3NnMHoGnVg0Gg0rsXg/e9OYXCNOjBoNBrXYvCHplMYXONTERjSa82iDmN/SBe/ACfGjkkXv87uO50WJFze2tWs7t4m3XwbDoN3sziFwTU+FYFBo9E8ORi9/90ZjK5RBwaNRuNaDP5WsFMYXKMODBqNxrUY/K1gpzC4Rh0YNBqNSxGD16adwegadWDQaDSuxeD9705hcI06MGg0Gtdi8KmcTmFwjTowaDQalyIG38TGGYyuUQcGjUbjWgxem3YKg2vUgUGj0bgWgz80ncLgGnVg0Gg0LsXoL385g9E1PvWB4cMmtciVOSNmpZi/bR/7Is4nsenbogGe7u6IwNXb0fwQvgGzGbo1epa8WTKhUNy9d59xqzanypIQhWZPwiN/PlRcHCfrN//X+ZnNZkb07smFyDOYTCbadPmAoOfqJLHbuWEtc8aPxWw2kye/P72GjcZkMvHj119x6uhh3Nw9AGjwyqs0ea2t7b7Tx4/yfd/PKV0piI/6DUzie9AnPTh/5m9MJhNv9+hJtTr1kvj+Y90afh07BrPZTF7/gvQd9RMmk4mQycGsW7aEuLj7vPvRp9So1wCAzatXMnv8WNv99V98iY8/7eVQ/+ypk5k1dTJKKQIrV2HQqO/t0heFzmPyz2OJvXePFq3b0vmDj2xprV9ozM2oKDJnzcqcxctT+qqT8MvUqfw8dQooRbXKQQSPGmWX3v3zz9m8fRsigreXF+OGDaNiufIAlKn1LF6engD4+fqyYfGSR/b/RGLw2rRTGFxjuoQ1EXlJRL5IpbxuP+699csUI2MGb/rOXU7YzoO0qFLeod0PyzfSLySc/80Lx9vDnWaBpQE4dOYC34SuoN+8cG5E36Xds5Uetyh23AhdxIUBw1IlL4CV8+cSdf0aI2aG0rJDZ+YFj3VoN2/Cz7zavhMjZoYSdf0aKxfMs6WVqliZETNDGDEzxC4oAEwe+R2Zs2V3mOfSebOIunaVsSFhtOnUlRnjHK8fNfPnH2ndsStjQ8KIunaVZSGzAShRrgJdv+pHBh9fO/sKVasx
amYIP4Us4vPvRrFy4XxiY5IG5djYWGZOmcSgUWOYt3wVB/buZuumjXY2RYs/Q6++/ciTL1+S+1u98RZt3m7vsMwpERsby7gpk5kwajRbloezY+8eft+0yc6mSsVA1v62kD2/r+XZqtXo9fXXdum7f1/L7t/X/neCAoCbyfnDqBhcY5qVSkSSbY0opcKUUkPTyrezlC+Uj/1/W1oIO0+dxc1kIk/mjEnsbsbcA8DdZMIkJpT1+vojJ4kzmwE4dekqft5eqVKuqPlhxF28lCp5AezesomKNWthMpmo8Xwj4uPiOHv6lJ3N2dOniI+Pp2aDxphMJirWrMXuzRuTyfEBi2dOJ3O2bOTIncdh+s5NGwiqVQeTycRzjZoSHxfHmVMn7WzOnDpJfHw8tRs3xWQyEVSrDjs2rgegfJVqlK0UlCRfv0yZ8bTWpu/ejU62fGvCl+Hj60eFSpXx8fGhbGBFliwItbMpWyGQ5+o9jzio5b3W7k0yZc708C8hGRaFLyejry9VK1XC18eHoMBAZi+Yb2fzbpu2ZM2SBYB6tWpx6/Zj13MMg5hMTh9GxegaUyyViPiKyFIR2SciB0WktYicFpEc1vQgEVln/dxfRIJFZCUwXUS2iUiZBHmtE5HKIvKuiPwkIpmteZms6T4ickZEPESkqIiEi8guEdkoIiWtNoVFZKuI7BCRgUlL7Dy+Xp5cirplO4+NiyNvVscPgb6vNmTA602Ii49n+d7DSdIrF/bn+PnL/6Y4aUb0rVvkKeBvO/f0zsC5iNN2NuciTuPl5W07z5U/P9G3Hnw3R/fuptcbrRj0YReuXrwIwK2oG2xcsZROn/dN1vedWzfJm8C3l7c3ZyP+srM5G/EXXt4PfOfJX4A7t26mqGvz6hV0b/USI/v0ptErLfFMkMc/nImIIFPmzLbz3Hnycu3KlRTzTg3+ivibLAl858+Tl8sP8T3h1+mUK1XK7lrl5+tTpWEDRv/yc5qV0+WIyfnDqBhcozOlagKcU0pVUEqVBcJTsK8MvKyUagfMAV4HEJG8QD6l1K5/DJVSUcA+4J8O7+bACqXUfSAY+EApVRnoBYyz2owBflZKVQEuOFH+R8KslMPrgxasYkDoSkwmoVaJInZpHepWRSnFop0HU7s4qURSTabEteNkdAO0eu99hs2Yy9Bpc8iYKTM/D7Z0dwQPGUiN+g3JlCX55a4dZZu4Zq4c+k65j/bZBo0ZGxpG9z79WbtsMbcdBROHBUgx61TBkS5HrRKAAcOHE3n+PGMGf2u7FjJ5MrvW/M7Y74YxZdYs/ti1M83K6lJM4vxhVAyu0ZnB5wPACBH5DliilNqY3H9uK2FKqbvWz/OAVcDXWAJEiAP7uUBrYC3QBhgnIn5ATSAkga9/+mmeBVpaP/8KfOeoECLSGegM0OS9HgQ+3wSAtjUrUip/bgCu37lLrgRdR57u7ly8cStpZlZi4uI4du4SFQPys+GIpSumRZWyFMyRlWFha5K9Lz2YOmoYB3ftACBbjpxcOHvGlhYbc5c8BQvZ2ecLKMy9ew/66C9FRuKT0fLd5E1g++Ibb/PzQEtguHw+ksjTp9gYvhSztUtt/LDBiAj7t2+z+M6Vi/MJfN+LiSF/wQA73/4BRbiXYHzgQuRZfDMm7dJLjvJVquHm7s4fmzfRoEmzRHkHsDr8waDxxQvnyZY9h9N5/xuKBBQiLIHvyAvnyZE96VjMpBkzWLB0CaFTpuLn52e7XvqZEgBUrVSJfHnzsmHLVqpXTtqtZjRSeH78JzC6xhRbDEqp41haAQeAISLSD4hLcG/i9vudBPdGAldFpDyWh/8cBy7CgKYiks3q53dr3jeUUoEJjoRt7OSrtw98ByulgpRSQf8EBYDZW/bQLyScfiHh7P/7HOUL5gUgqEgB4pWZC1H2gcHPy9M27uBuMvFM3lxcirL0A9crU4xKhQswdsUmomPjUiqSS3n3k962weKKzz7Hni2bMJvNbF2zEjd3dwoE2Ld6
CgQUweTmxtY1KzGbzezZsolKNWsB2I1HrF28ED9r98jQaXMYNec3Rs35jaKlylC6UhDv9+5D58++4qeQRfwUsogqteqwc9N6zGYzG1cux83dDf8iRe18+xcpipubGxtXLsdsNrNz03qCatV+qL7jhw4QGxsLwIkjh4iJjqZEqTJJ7J5v1IToO7c5sHcP0dHRHNy7hxdfaZnELi1o3qgxt+7cYefePdyJjmbn3r20faWFnc3SVasYMyGYccOGU6xwYdv1M5GR3LaON/wVEcH5ixcJqljRJeVOcwzezeIUBteYYotBRPIB15RSM6wzgN4FTmN5iC/nQe09OeYAvYHMSqkDiROVUrdFZDuWLqIlSql44KaI/CUirymlQsQSfssrpfYBm7G0LGYAbzip0yFrDv5JWf88DGrdFLNS/Lb9QfEGvNaEfiHh+Hp70rVRTUQEAS7fvEPIH/sAeL5scQT4oMlzAFy7E833yzb8myIBEBA6HfdcOUGEYuuXcTN8DZeGjHzs/Bq3as3+bVvo9UYrTCYTr3fuZkvr9cZrjJhpaci91rEL88aPI2TiL+TOV4BGLVsDMHnEUG5evwZABh9f3u/zdVInyfBC63bs3rqJ7q+9hMlk4s1uD3bT6/Hay/wUsgiAdl16MGPcD8waP5Y8Bfx54fV2ACyYNomVixagzGam/TCKkMnBjPp1HtvXr2XM130sNTOBxi1a4V+oUBL/nt7etHmnPV9+/AEoRbmKlahRuza9unehXIUKvNO5K6uWLeX7oYMxm838Nm8OSxaEErbWMvDestHz3L0bjVKKF2rX5K33OtHmHedmKXl7e9PlnXd47+OPUUpRtWJF6teuzdvdu1G5QgU+6vw+g0aNxGw20+OLz4EH01K37NjB0DHfW6Y9KsULDRtSv1Ytp7/3J5ondCZOqmJwjeK4fzeBgUhjYDhgBu4DXYEMwCTgIrANCFJK1RWR/sBtpdSIBPfnBiKBgUqpb6zX3rXe08N63gpLN1NdpdR667XCwM9AXsADmKOUGmC9PgtLUJsP9FVKPWh/O+DL2UtTbGGkBU/j1p4ZrDOF0oP03NqzIPHp5ts9p2u6xhzwWP0ll0ePdfrvMWfP7obskzG6xhRbDEqpFcAKB0nPOLDt7+DaxcR+lFJTgakJzkNJ9J9MKfUXloHvxPn9BdRIcCndp71qNJpH4AntPklVDK7R2KXXaDTGQ8T5I8WspImIHBORE8m9NCsir4vIYRE5JCKzUl2PY6eG1vjUL4mh0WhcTCpN0RQRN2As0BA4C+wQkTCl1OEENsWBL4FnlVLXRSRXqjhPCYNr1IFBo9G4FHFLtcdOVeCEUuoUgIjMAV4GEr6B2gkYq5S6DqCUSr0lBR6C0TXqriSNRuNaUu/lr/zAmQTnZ63XEvIM8IyIbBaRP0QkybhlmmBwjbrFoNFoXMqjvPyV8EVVK8FKqeB/kh3ckng2kDtQHKgLFAA2ikhZpdQNpwvxGBhdow4MGo3GtTzCQ9P6gAxOJvks4J/gvABwzoHNH9Zldv4SkWNYHqI7nC7E42BwjborSaPRuBaTyfnj4ewAilsX1vTE8uJrWCKbhUA9AOvCn88Ap0hrDK5Rtxg0Go1rSaV1hJRScSLSA8t7Vm7AZKXUIREZAOxUSoVZ0xqJyGEgHvhMKXU1VQrwMAyuUQcGjUbjUsTNLdXyUkotA5YlutYvwWcFfGI9XIbRNerAoNFoXIvBVx51CoNrfCoCw+Wo5JfSTkvSa70igGLdP0rZKA347fNU2bH1sTh9Ke17CJKjeVC5dPPteFPVJ5gndNeyVMXgGp+KwKDRaJ4c5AndnCY1MbpGHRg0Go1rMXg3i1MYXKMODBqNxrUYfOVRpzC4Rh0YNBqNSxGDb2LjDEbXqAODRqNxLQavTTuFwTXqwKDRaFyLwQdmncLgGnVg0Gg0LuVRFpgzKkbXqAODRqNxLQavTTuFwTXqwKDRaFxL6m1i8+RicI3GLr1GozEcRu9m
cQaja3zqA8PXrRqTN2tmzEoxdd02tp/4O1nbkW+/jI+nJ10nhgDwWo1A6pcpjrubiUm//8Eff0Yke6/ZbGZE755ciDyDyWSiTZcPCHquThK7nRvWMmf8WMxmM3ny+9Nr2GhMJhM/fv0Vp44exs3dA4AGr7xKk9fa2u47ffwo3/f9nNKVguj8xf8e9+ug0OxJeOTPh4qL42T95o+dT3K8WzuI7Bl9UUqxfN9RjkTa70Lo7eHOe/Wq4e3hjlJwMeoWMzfvtrOpW7ooVYsWZOnuwxyKvJiiT7PZzORBfbl6LhIxmWj2dkfKVn82id30777h3KmTKGXmy+AZtutLp01g/+YNuHtY/lxKVq5G8w5dnNI7ZUIwUyaMRykIqlqV0WN/tkufN3sWP//wPffu3aPNm2/x4Se9bGnvvfUGJ44fQwEBAYWZOnsuJoMvtQAYvpvFKQyu8T/wv+zxaV65DFl8M9BlwjxmbdzF23WqJGvbqloF7seb7a4di7zIT+EbuRt7P0VfK+fPJer6NUbMDKVlh87MCx7r0G7ehJ95tX0nRswMJer6NVYumGdLK1WxMiNmhjBiZohdUACYPPI7Mmf796vm3AhdxIUBw/51Po6oWbwQft5ejFy6nlUH/qRx+ZIO7bb9GcHIpev5IXwjOTL6UqN4IVuar6cn5QvmIzYuzmm/m5cu5PaN63w2bgqN2r5N+IzJDu3K16xN+z4DHKblzFeAz8ZO4bOxU5wOCrGxsUwO/oXvx/3CivUb2bNrJxvXr7OzKVGyJP8bMJh8+e13a1wYGkLEX3+xauMWVm/cQuTZs8ybPdMpv088Is4fRsXgGp/qwFClWEFbC2HTsVO4m0wUyJ45iV0mby/qlCnGrE277K7v//s8h85ecMrX7i2bqFizFiaTiRrPNyI+Lo6zp+330jh7+hTx8fHUbNAYk8lExZq12L15Y4p5L545nczZspEjdx6nyvIwouaHEXcxbfZLL5k/N0esNfwDZ87jZhJyZvS1s4m5H8fOv84CEGc2cyP6Lll9M9jSW1Uvz65TZzCrxLsbJs+RHX9Qqkp1TCYTgc/VIz4+jot/n05iF/hcPXIXDHh0YcmwbHEYvn5+VAqqgo+PD4GVKjN/7hw7mwoVK1G/YUMDtDvzAAAgAElEQVQk8bx3EeLj44m5G030nTsoZcbfv2CqlS1dEZPzh1ExuMYnolQislBEdonIIev+p4jIeyJyXETWicgEEfnJej2niMwXkR3WI2mfgJP4eXlx/nqU7fze/TgKZMuSxO6jF+qw9uCfRN+LfVxXRN+6RZ4CD3bo8/TOwLmI03Y25yJO4+XlbTvPlT8/0bcerAx7dO9uer3RikEfduHqRcsD9lbUDTauWEqnz/s+dtlcRQZPD67ejradx8bFkyuTX7L2Gb29yJHRl/1/nwegZL5c+Hh6sPn46Ufye/fObXLkyWc79/Ty4lLkmYfckZQr588yrFt7fuz9AZGnTjh1z98Rp8mS5cH/p7z58nHlymWn7n2lZSsKFy1Ck3p1aPZ8XYo/U4JnayftejQiYhKnD6NidI1PRGAAOiilKgNBwIcikh/4H1AdaAgk7HMYA4xWSlUBWgITU7MgiSuiQUX8yeKbgQXb9//bnJNcMSVuRj6kFtzqvfcZNmMuQ6fNIWOmzPw8+GsAgocMpEb9hmTKkvVfli99MDv4XgBMJhPv1A7i5MWrnL1mCd6Nyj9D2K7Dj+7E0ff6CE346k1e5OPR4+k9bgoFi5dk7g9OdrU58CsO93ZPys5tf3Dh/HkWr1zD4pVrOPHnn8ybPcvpMj/RuLk5fxgVg2t8UgafPxSRFtbP/sBbwHql1DUAEQnBso8pQAOgdIJR/0wiklEpZbfpgrXl0Rng2XYdKfnc8wC836AGFQIs/blXb0WTN+uDriMvD3cir9+wK1jFwgXI6O3N+M6vI5Z8+aH9q3w4ZUGKoqaOGsbBXZb9uLPlyMmFsw9qqbExd8lTsJCdfb6Awty7
F2M7vxQZiU/GjADkTWD74htv8/NAS2C4fD6SyNOn2Bi+FLPZMgYyZdR3DE6xdK6heaXSFMuTA4Cb0TFk9/OxpXm6u3Hl5h2H93WoU4XbMfdYuPMgAH7enni5u9O2ZiBg+R2aViwF4HAA+rfxP3Jiv2XQOlO2HFy58GD/9Nh798iVr4DTGrLnzmv7/EL79xne7V2n7isYUJhlSxbbzs+fO0f2HDmcunf2zBkULVbcZl+qTBm2bdnM623bOV3uJxWjz9hxBqNrTPfAICJ1sTzsayilokVkHXAMKJXMLSar7d2H5auUCgaCATr+MsdWdRu/eqvNpnnlMtQvW5zZm3dTq0QR4s2Ks1ej7PKZsGYrE9ZY7imWJwefvljPqaAA8O4nvW2fl8+bzaYVy3i1fSe2rV2Nm7s7BQKK2NkXCCiCyc2NrWtWUq1eA/Zs2cRzTZoBlvGHf+zXLl6IX2ZLQBs67UGf9Y9ff4VXhgy0/+RzSKeNehKzePeDGn7NZwKoFJCfNYdOUM4/L/FmxeVbSQND25oV8XAzMXHtNtu12zGxDF+yznb+YZNarDnwZ7Kzklq8/4Ht88aw+exat5qGrd9i/+b1uLm5P9JYwsW/T9vsNywKwcPLy6n7mrzwIiOHfsueXTspUao0e3fvYsDQ75y6t4C/P+FLlxATEwNK8eexY7R8vbXTZX6i+S/MrEoJg2tM98AAZAauW4NCSSzdRxOAOiKSFbiFpcvogNV+JdADGA4gIoFKqb2P43jxrkNULuLP+E6vY1aKaeu329LGdWxFt4mhD73/1arlaRJYEhGhQ/3qtK5ZiZ7TfnNo27hVa/Zv20KvN1phMpl4vXM3W1qvN15jxEzrFNiOXZg3fhwhE38hd74CNGppeRhMHjGUm9evAZDBx5f3+3z9OJJTJCB0Ou65coIIxdYv42b4Gi4NGZkqeW85fpoSeXPy6Qt1UEqxYt8xW1rPZrUZvWwDuTNnxD97FuLizfRsVhuAY+cusWzv0cf2++yLLTi6ewfDu72LmEw0fes9W9rw7u35bOwUAKZ+248LEX8BMPT9tyhQrARvftaX8JlTuXjmNCKCm7sHLTp94MhNEry9vXm3Y2c+6vo+SikqVq5C7br16fpeewIrVeL97h+wdPEihnzTH7PZzNyZM5g/dw7rt+2k+0c92b51K43q1EKAwkWL0rl7j8f+Dp4oDF6bdgqDaxT1CLM70qQAIl7AQiA/lpZCTqA/lq6jXsA54AhwTSnVR0RyAGOxtCjcgQ1KqYfOH0zYYnAlLWsEpodb4Onc2vNhA9lpTbpu7enrnbJR2vBYT7/bG7c4/ffo91xNQz5hja4x3VsMSql7QNPE10Vkp1IqWETcgd+wtBRQSl0B/iNtao3m6SPJ1Nz/IEbXmO6B4SH0F5EGgDeWoLAwncuj0WhSA4NvYuMUBtf4xAYGpVSvlK00Go3hMHht2ikMrvGJDQwajea/yZP6UldqYnSNOjBoNBrXYvAZO05hcI06MGg0Gtdi8IemUxhcow4MGo3GpYj7f/+xY3SNxi69RqMxHgavTTuFwTXqwKDRaFyLwQdmncLgGnVg0Gg0LsXoL385g9E16sCg0Whci8G7WZzC4BqfisDQvWntdPEbFR2TslEakV5rFrX4bmi6+AUo4Ow+CWnANSe2d00z0m+tpMfD4N0sTmFwjcZu72g0GsMhJjenjxTzEmkiIsdE5ISIJKkNiUgXETkgIntFZJOIlE4TUYn9GlyjDgwajca1mMT54yGIiBuWlZabAqWBtg4eirOUUuWUUoHAMGBUWkhKgsE1PhVdSRqN5gki9TaxqQqcUEqdAhCROcDLgG13KKXUzQT2vjjaYzctMLhGHRg0Go1LScVtL/MDZxKcnwWqOfDXHfgE8ATqp5bzh2F0jborSaPRuBaTyelDRDqLyM4ER+cEOTl6+iapLSulxiqligKfA33TSpYdBteoWwwajca1PEJtOuHe7Q44C/gnOC+AZcfH5JgD
/Oy083+DwTXqFoNGo3Ep4mZy+kiBHUBxESksIp5AGyDMzpdI8QSnLwB/pqqYZDC6Rt1i0Gg0riWV3gpWSsWJSA9gBeAGTFZKHRKRAcBOpVQY0MO6E+R94DrwTqo4TwmDa9SBQaPRuJZUfCtYKbUMWJboWr8Enz9KNWePgsE16sCg0Whci8HfCnYKg2t86gLDgpm/Mn/mdFCKsoGV+HKI/TIK0Xdu06tzB6KuXcPdw4MvBg+jVLlytvTjhw/R7+MeVKxWnc8HDgFg+Nd92bdzOwBZsmZl6C+T8PPzs8vXbDYz6JMenD/zNyaTibd79KRanXpJyvfHujX8OnYMZrOZvP4F6TvqJ0wmEyGTg1m3bAlxcfd596NPqVGvAQCbV69k9vixtvvrv/gSr77z3kO/g3drB5E9oy9KKZbvO8qRyEt26d4e7rxXrxreHu4oBRejbjFz8247m7qli1K1aEGW7j7MociLD/XnDIVmT8Ijfz5UXBwn6zf/1/klJHjBfIIXhAJQtUxZxn3Zxy79ix/H8PuO7Qjg7u7O4O4fULdyEIMnTSRs/Vqb3b379xn6wcc0qVnTad+/Tp7Ir5MnopSiUpWqfPf9j3bpC+bNYcLYH7l37x6vtX2Drh/1tEu/fOkibV5pTkDhwkyaOfcRlT+ZGH2BOWcwusZ0K72ITPznDT4Rue0Kn3GxsYTOmMZXQ4YzaX4Yhw/sY+fWzXY2E74fSYYMPsxcvpp6jZvy49CBdumjB35Nthw5bOenjh9jz7Y/+GV2KDOWrsRsVswITjopYOm8WURdu8rYkDDadOrKjHE/OCzjzJ9/pHXHrowNCSPq2lWWhcwGoES5CnT9qh8ZfHzt7CtUrcaomSH8FLKIz78bxcqF84mNjU32O6hZvBB+3l6MXLqeVQf+pHH5kg7ttv0Zwcil6/khfCM5MvpSo3ghW5qvpyflC+YjNi4uWT+Pyo3QRVwYkPprHcXGxTF+fii/fNmXDROnsPPIYdbt2mln07Pdm2yfPpNt02fStlET+v9i+f36vNeRbdbrg7t/iJvJ9EhBITY2lumTJjBszE8sWrWWfbt3sXnjBjub4s+U4It+35A3X36HeXz1aU9y5cr1iKqfbO76eDt9GBWja0y3wKCU6qiUOpyyZeqxbtUKfHx9KVMhEG8fH0qVK8/KsIV2Ngf27KZpi5YAtOvUhetXrmA2mwGYNSmYrNlzkDtv4j9ixe1bt4iNjSUu7j658+ZL4nvnpg0E1aqDyWTiuUZNiY+L48ypk3Y2Z06dJD4+ntqNm2IymQiqVYcdG9cDUL5KNcpWCkqSr1+mzHh6egJw9250it9Byfy5OWKt4R84cx43k5Azo32wibkfx86/zgIQZzZzI/ouWX0z2NJbVS/PrlNnMKvUe4k0an4YcRcvpWz4iCxevw4/nwwElSmDj7c3lUuWYu6KcDub3Nmz2z7fuhvtcOb49KWLKVus2CP5XrlsCb5+fgRWDsLHx4fyFSuxKHSenU25wIrUeb6BwxeiVoUv49bNKCpVSfI+k0aTprgkMIiIr4gsFZF9InJQRFqLyDoRCUpgM1JEdovIGhHJab32oYgcFpH91lfBEZH+IvKriPwuIn+KSCdny3H+7BkyZspkO8+ZJw/Xr161s4m5e5eAopbZX56enojJxPnISKKuX2dl2EJ6D/jWzr7IMyWoXKMmH7d/k3eaN8HTy4sWbd9I4vvOrZvkLfBgOrKXtzdnI/6yszkb8Rde3g9qEHnyF+DOrZukxObVK+je6iVG9ulNo1da2gKFIzJ4enD19oMAEhsXT65MfsnaZ/T2IkdGX/b/fR6Akvly4ePpwebjp1Ms15PA6fPnyJIxo+08X65cXL5xPYndZ2NGUfmNtoSsXsWQHh8mST908iTvvPhoXVxnIiLIlDmz7TxP3nxcvXLZqXvj4uL4YcQw+g1Ov9VqNU8vrmoxNAHOKaUqKKXKAuGJ0n2B3UqpSsB64Gvr9S+Aikqp8kCXBPbl
sczXrQH0E5GkVXQHKAc1XGdeXTeZYGjfL6jXpBlZsmWzS7t4/jyH9+1l5MSpTFm0jLjY+0wcM9qB76T5JvbtqHyOX3y059kGjRkbGkb3Pv1Zu2wx0bcfrWfOnMzSKiaTiXdqB3Hy4lXOXosCoFH5Zwjb5dKG3r/C4W/u4Dsd/tEn7Jo5m1bPN2TwpAl2ab+t/R03NxP1H7Hm/ri/J8Cgfn0oU648pcuWS9lYo0llXDX4fAAYISLfAUuUUhsTPRTNwD8jazOABdbP+4GZIrIQSNjns0gpdRe4KyJrsSw0ZdcnZH2tvDNAn2+/o2W7N8nnX5ANq1fabC5fuJDkQe+dIQOnT/5JsZIliY2NRZnN5M6bnwuRZ4k4eYIVi36zdS2NHtif7DlzkjFzFvIXtPTBV6peg6MH9wMQPPxb9m/fBkC2XLk4f/bBkif3YmLIXzDAzrd/QBHuxTzYw+FC5Fl8E9R2U6J8lWq4ubuzb8cfuPk/eOeleaXSFMtjGRe5GR1Ddj8fW5qnuxtXbt5xmF+HOlW4HXOPhTsPAuDn7YmXuzttawYClsDWtGIpgFQZgE4LCufLz+IN623n5y5dIkeWLMna9377HYLeWmF3be7KcCqVdDwW8zAKBgSwcvlS2/mF8+fInmB86mEcO3yYy5cu0qBmNZQyo5Si+3vvMnbS1Ecuh0bzqLikxaCUOg5UxhIghohIv5Rusf77ApYlZysDu0TEPVF6YvuEPoOVUkFKqaCW7d4EoHbDxkTfucOR/fuIiY7myIH9NGz+st19ZQMrsvy3+QDMmvALWbPnwGQyMWXhUmaFr2FW+BpKlatAxWrV6fm//hQoFMCVSxe5GXUDs9nMwb27ye9fEIDOn33FTyGL+ClkEVVq1WHnpvWYzWY2rlyOm7sb/kWK2vn2L1IUNzc3Nq5cjtlsZuem9QTVevgmQ8cPHbANNp84coiY6GgKP1PCzmbx7sOMXraB0cs2cOTcJUrlzw1AOf+8xJsVl28lDQxta1bEw83E1A0PBmpvx8QyfMk6Rixdz4il67kXF8fyPUee2KAA8MJztbkdHc2uI4eJjolh19EjvN6wsZ3N1v37bJ/HLwjFO0FXXFxcHMcjIuj48quP7LtRk2bcuX2bfXt2Ex0dzf49u3mpZSun7p29cDGrt2xn9ZZtNHnxJQoXLaqDgsZluKTFYO3quaaUmmGdgfRuIhMT0ArLOh/tgE1ime/lr5RaKyKbrNf/6Qx/WUSGYOmCqoulyylFPD09ebXdmwz6vBcKRZnygVSpWYv+n3xIyXLladO+Ix0/7kXvzu15o2kD3D086G2dkpoc9Zu+wMbVK+nW9jUQIWv2HLz/6WdJ7F5o3Y7dWzfR/bWXMJlMvNntQT92j9de5qeQRQC069KDGeN+YNb4seQp4M8Lr7cDYMG0SaxctABlNjPth1GETA5m1K/z2L5+LWO+7mPplhJo3KIVefL78/eJCIfl3XL8NCXy5uTTF+qglGLFvmO2tJ7NajN62QZyZ86If/YsxMWb6dnMEpiOnbvEsr1HnfmaH4uA0Om458oJIhRbv4yb4Wu4NGTkv87X29OTTi1a8v7ggSgFQaVLU69KFTp8049KJUrRo01bRs+awYfDv8MkgqeHB0N6PHhfaM7KFXh5ehFUpswj+/b09ubN9u/x2QfdUEpRoVJlatWuy0ddOlGuQiAdu3YnfOliRgweiNlsJnTOLBaGzmPFxq3/WrdG828Qx/2gqexEpDEwHEuX0X2gKzAC6KWU2mkNFqOBZkAU0Bq4AawFMmPpmJ2hlBoqIv2BfEBRoCAwTCll3ymciD0R51yzBnsi0nNrz+3JBIa05qnd2rNwkXTznT+r892NqcxjvcV169Ytp/8eM2bMaMg3xYyu0SUtBqXUCixrfSSkboL0f1oC/0tkUyuZLI8rpTonk6bRaDSaf8FT9+azRqNJX+67eaR3EdIco2s0XGBQSvVP7zJoNJrHxwW91+mO0TUaLjBoNBpj
E2+d7v1fxugadWDQaDQuxRUTXtIbo2vUgUGj0biU1Fxj60nF6Bp1YNBoNC7F4M9MpzC6Rh0YNBqNSzF6N4szGF2jDgwajcalJLdo438Jo2vUgUGj0bgUo8/YcQaja9SBQaPRuBSz2di1aWcwusanIjDcu596W1A+CgVzZE0XvwCnL11N2SgNSM/1is5+2DvdfOdZHppuvo2GwbvfncLoGp+KwKDRaJ4cjD4w6wxG16gDg0ajcSlGH5h1BqNr1IFBo9G4FKPXpp3B6Bp1YNBoNC4l3uADs85gdI06MGg0Gpdi9Nq0Mxhdow4MGo3GpRj9oekMRteoA4NGo3EpRl9gzhmMrlEHBo1G41KM/tB0BqNr1IFBo9G4FKMvF+EMRteoA4NGo3EpBq9MO4XRNT51gWHR7JksmjMDlKJ0hYr0GjjELj06+g59unYi6vo13D08+HTAt5QoU44LkZF807M70XfuUKBQAIPHTbDdM/3nH1m/YjkCeGXIQN9h35PX3z/ZMsyeOplZUyejlCKwchUGjfrevoyh85j881hi792jReu2dP7gI1ta6xcaczMqisxZszJn8fJH1m82m5k8qC9Xz0UiJhPN3u5I2erPJrGb/t03nDt1EqXMfBk8w3Z96bQJ7N+8AXcPy3+dkpWr0bxDF6d8By+YT/ACy9IRVcuUZdyXfezSv/hxDL/v2I4A7u7uDO7+AXUrBzF40kTC1q+12d27f5+hH3xMk5o1H1W+QwrNnoRH/nyouDhO1m+eKnn+w8SJE5k4cSJKKapWrcqPP/5ol/7JJ5+wdetWRAQvLy++//57KlSowIoVKxg0aBBxcXGICK+++iq9evVK1bKlF0YfmHUGo2s0pXcBHhURCRCRg49zb1xsLAtn/8pnA4cybs4Cjh7cz+5tW+1spvwwGu8MGZgcFs5zDRvzyzBL4PDx9aHV2x2oXqeenX1sbCy/L13MoB/HM3HhMnLlycuUn+wf9IntZ06ZxKBRY5i3fBUH9u5m66aNdjZFiz9Dr779yJMvX5L7W73xFm3ebv848gHYvHQht29c57NxU2jU9m3CZ0x2aFe+Zm3a9xngMC1nvgJ8NnYKn42d4nRQiI2LY/z8UH75si8bJk5h55HDrNu1086mZ7s32T59Jtumz6Rtoyb0/+VnAPq815Ft1uuDu3+Im8mUakEB4EboIi4MSP01nmJjY5kwYQI//fQTa9euZdeuXWzYsMHOpnLlyixfvpwtW7ZQo0YNvvrqKwAyZcrE8OHD2bp1K5MnT2bu3LmcP38+1cuYHpiVcvpICRFpIiLHROSEiHzhIN1LROZa07eJSEAaSEqC0TU+sYFBRFK9NbNpzSp8fHwpVb4C3j4+lChbjjVLFtnZHN67h8YvvwpA6w6duXb1CmazmUxZsvL8iy/h6ellZ2+29iXeunUTs9lMzN27ZMuRM9kyrAlfho+vHxUqVcbHx4eygRVZssB+AbayFQJ5rt7ziEiS+19r9yaZMmd6LP0AR3b8Qakq1TGZTAQ+V4/4+Dgu/n06iV3gc/XIXTDgsf0kZvH6dfj5ZCCoTBl8vL2pXLIUc1eE29nkzp7d9vnW3WhIKp/pSxdTtlixVCsXQNT8MOIuXkrVPAGWLFmCn58fQUFB+Pj4UKlSJebNm2dn88Ybb5AlSxYAateuza1btwCoUaMG1atXB6BkyZJ4eHgQERGR6mVMD5RSTh8PQ0TcgLFAU6A00FZESicyew+4rpQqBowGvksDSUkwusY0DQzW2v1REZkmIvtFJFREfESkn4jsEJGDIhIs1iegiKwTkW9FZD3wkYjkFpHfRGSf9finmugmIhNE5JCIrBSRDM6U53zkGfwyPXio5sidhxtX7VchjYm5S8GiRQHw9PTEZDJxMTIy2Ty9vb1p/MqrDP7sYzq81ITrV6/Q8aNPkrU/ExFBpsyZbee58+Tl2pUrzhQ/Vbh75zY58jxoiXh6eXEp8swj5XHl/FmGdWvPj70/IPLUCafu
OX3+HFkyZrSd58uVi8s3riex+2zMKCq/0ZaQ1asY0uPDJOmHTp7knRdTt7snrYiIiCBzgt86X758XL58OVn7KVOmUKZMmSTXw8LCUEoRFBSUJuV0NUo5f6RAVeCEUuqUUioWmAO8nMjmZWCa9XMo8Lw4qnGlMkbX6IoWQwkgWClVHrgJdAN+UkpVUUqVBTIALyawz6KUqqOUGgn8AKxXSlUAKgGHrDbFgbFKqTLADaClMwVxGJ2d+P7ElPzXFBMTw4ZVK/j822FMDgsne85cjOzf92GFcOAgxSKkHo/5HfxD9SYv8vHo8fQeN4WCxUsy18llth199+JA+PCPPmHXzNm0er4hgydNsEv7be3vuLmZqF+lmtPlTU8cak7mux4yZAjnzp1j+PDhdtePHTvGt99+y+eff467+39jSDDebHb6EJHOIrIzwdE5QVb5gYS1mrPWaziyUUrFAVFAdtIYo2t0RWA4o5TabP08A6gF1LP2hR0A6gMJq0lzE3yuD/wMoJSKV0pFWa//pZTaa/28CwhI7DThl71wziwA8hUoyO2bN202Vy5eIEu2bHb3eXtn4O+TJwFLH7HZbCZX3rzJituxydJnXLpCRUwmE3WaNOPMX6eStfcPCOBmVJTt/OKF82TLniNZ+9Tgt/E/Mrx7e4Z3b4+3rx9XLpyzpcXeu0eufAWczit77rx4+/gA8EL797kXHe3UfYXz5eeGtZsE4NylS+SwdqE4ovfb73D2kn33ztyV4VQqWdLpsqY3AQEBRCX4rc+dO0eOHEl/62nTphEWFsbUqVPx8/OzXb948SIdOnTg9ddf55VXXnFJmV3Bo/S/K6WClVJBCY7gBFk5irKJo7EzNqmO0TW6IjAkLqACxgGtlFLlgAmAd4L0O07keS/B53gczK5K+GW/0qYdADWfb0h09B2OHthPTHQ0xw4eoH4z+26J0oGBrFi0AIC5k4PJmj07poe0GPwDArh75w6RZyz9vzs2biB7rlzJ2j/fqAnRd25zYO8eoqOjObh3Dy++4lSD57Fp8f4HtsHi0lWqc2THH5jNZvZuXIubm/sjjSUkHI/YsCgEDy+v5I0T8MJztbkdHc2uI4eJjolh19EjvN6wsZ3N1v37bJ/HLwjF29PTdh4XF8fxiAg6Wsd/jECzZs24ffs2u3fvJjo6mt27d9OqVSs7m/DwcMaNG8fo0aMpau3CBIiOjqZNmzbUqFGDTz5JvmvSiKTiwOxZIOH0vwLAueRsrOOWmYFrqSQlWYyu0RVt04IiUkMptRVoC2wCagJXRMQPaIWlX8wRa4CuwPfWQRjff1MQT09PXmrdjmF9eqOAUuXKU7nGswzu3ZMSZcrR6p0OvNujJ326daLDS01wc/fgk/6DbPe3b97YNtj8bvPGfNi3P5Wq1aBa7br8r0cXTCJk8PWlz7DkZyV5envT5p32fPnxB6AU5SpWokbt2vTq3oVyFSrwTueurFq2lO+HDsZsNvPbvDksWRBK2FrLzKWWjZ7n7t1olFK8ULsmb73XiTbvOD9L6dkXW3B09w6Gd3sXMZlo+tZ7trTh3dvz2dgpAEz9th8XIv4CYOj7b1GgWAne/Kwv4TOncvHMaUQEN3cPWnT6wCm/3p6edGrRkvcHD0QpCCpdmnpVqtDhm35UKlGKHm3aMnrWDD4c/h0mETw9PBjS48E03TkrV+Dl6UWQgz74f0tA6HTcc+UEEYqtX8bN8DVcGjLyX+fr7e3Ne++9R7du3VBKUblyZerWrUunTp0IDAyke/fuDBs2DLPZzKeffgqAr68vK1euZMyYMdy6dYutW7fy7LOW6cT9+vWjcePGD3NpCFJxKucOoLiIFAYigTZAu0Q2YcA7wFYsz5rflQvmkhpdo6Tld2SdNrUM2IAlGPwJvAV8hUXgaSx9YxFKqf4isg7opZTaab0/NxAMFMHSMugKnAeWWMcnEJFegJ9Sqn9y5fjjxN/pMqk4V+aMKRulERsOOzconNq0zuiZslEa
8bRu7ZkxY7r9P3us0bHfD51w+u+xfpliD/UhIs2A7wE3YLJSarCIDAB2KqXCRMQb+BWoiKUW3UYplXwmUk8AABt/SURBVHxfbyphdI2uaDGYlVKJJ7v3tR52KKXqJjq/SNIReICyCWxGpEIZNRqNi0jNrQqUUsuwVD4TXuuX4HMM8FrqeXQOo2v8b0xz0Gg0huGf7tj/MkbXmKaBQSl1mgS1e41GozH6fsjOYHSNusWg0WhcitHXEXIGo2vUgUGj0bgUg2+H7BRG16gDg0ajcSlmoz81ncDoGnVg0Gg0LsXoA7POYHSNOjBoNBqXYvSBWWcwukYdGDQajUsx+sCsMxhdow4MGo3GpRj8mekURteoA4NGo3EpzuxaZnSMrvGpCAxfzAhLF7+ru7dJF78AzYPKpYvfa7H308UvpO96RReatkrZKI3IuGlFuvl+HIzezeIMRtf4VAQGjUbz5BBv8Bk7zmB0jTowaDQal2L0bhZnMLpGHRg0Go1LMXo3izMYXaMODBqNxqUY/KVgpzC6Rh0YNBqNSzF6bdoZjK5RBwaNRuNSjP7QdAaja9SBQaPRuJR4o/ezOIHRNerAoNFoXIrRa9POYHSNOjBoNBqXYvSpnM5gdI06MGg0Gpdi9Nq0Mxhdow4MGo3GpRi9Nu0MRtf41AeGiV1aE5AzG/FmxbCw3/l/e3ceJkV1r3H8+7IIAdzRICoOgoIi27AoioCIwjXBa4wafYgriiCJGCQXo4n7CirgBgKyuCGIRhYRREBcwIXFXZQYRAVE3JAdYX73j6qBnqGBRqaqu5zf53n6me7q6n7rzPT0qTqn6pxp73+6zToTr7mMCuXKbfljdxk8msUrfmBwl3M47ID9AChTRpQrU4aTbnpwp5mDRoxg4IjhYMaxTZoy+N57izzfvXdvXn/rTSRRsUIFHurTh8b1GwBQr+UJVNhjDwCqVK7MKxMm7lJ5hw8ZzPAhD2MGTZs3p9+DA4s8P2bUkwy8rz8bNmzg3D+fz5U9e215rvP5nfjPp59gQF5eTUaMGk2ZMmUyzn5s2FAeGzYUMyO/WXPu6n9/keefHfMUQx68nw0bNnD2eZ3o1uNvRZ5f8c1yzj2jI3k1a/LIE6Mzzh06dChDhwa5zZs35/77i+b27NmT2bNnI4kKFSrQv39/GjZsyJQpU7j11lvZtGkTkjjzzDPp1avXdlJ23WGjHqH8wdWxTZv4rG3HEnvfXJf04SIykfQyZv5fHQFJV0r6WNIT2ci/oHUzqu5ZhXa3DGTApJn06thmu+v2HT+d9rcNov1tg1i84gcAugwes2XZ6wsW8dX3P+40c+PGjTw0fBhD7u3HrBcm8/Y785n+2mtF1mnWuBEz/v0c86fP4ITmx9LrhhuKPD9v+gzmTZ+xy5XCxo0bGTZ4EP0fGsSUma8yf+4cXp35cpF16tSty79uvo3qBx9cZPlzY59m8aJFTH11Fi+9OoslX33FmFGZ/9k2btzIo48Moc+ABxg3dQbvzpvL66++UmSdI46swzXX38RB1Q9O+x7XXv03DjzwwIwzC3OHDBnCAw88wIwZM5g7dy6vvFI0t0mTJrzwwgvMmjWLFi1acO211wKw11570bdvX2bPns2wYcMYPXo0y5Yt26X8Hflx7Di+vrlPib1fUphlfkuqpJcxqxUDcAVwmpl12tmKkkr86KbtMUcw7YPgCGHS/I8pV6YstX67/y96r+a1azBp3sc7XW/c5BfYs3JlmufnU7lSJZo2asSoZ58pss5F557HvvvsA8BJLVuyavXqX7RNxU2aMJ7KVaqQ37QZlSpVolF+E54Z/VSRdRo2zqftKacgFftoSGzevJn169ayds0azAo49NAaGWe/OGkilatUoVGTplSqVIkGjfMZN3ZMkXXqN2pM65PbIWmb10+dPIlVP60kv9mxmRcYmDhxIlWqVKFp0yA3Pz+fMWOK5nbq1Il9wt93q1atWLVqFQAtWrTguOOOA6Bu3bqUL1+exYsX
71L+jqx8Zjybln9TYu+XFAVmGd+SKullzFrFIGkQcDgwXlJvSbMkzQ9/1gnXuUjS05ImAC+Gy/4u6W1J70m6aXe2Ye/fVOSLb3/Y8njdzz9T67dV067bq+NJTLmuK/dc8L/bPFfvkGpULF+Op2e/s9PMRYu/YJ+9997y+OBqB7Hi22+3u/6Qxx6l/lFHFVnW5OS2NDulHf0GDdzOq9L7YvHnW74AAQ6qXp1vv12R0WvP+ONZ1Kx1OB1Oas1pJ7fhiCPrcEKr1hlnf7l4MXullLvaQdX5LsPsTZs2cd/dfbj+tjszziu0ePFi9k7JrV69OitWbD93+PDh1KtXb5vl48ePx8xo2rTpLm+DK8rMMr4lVdLLmLWKwcy6AkuBk4CBQCszawxcD9yesmoL4EIzayvpVOAIoDnQCGgiqVW695fURdIcSXOWzn094+1KV4P3HPkcp942iPMGPErtalXp+fs2RZ4/v3UzFn3zPZsyaFdM90FIt4cMcHPfvixZtowBt239dTw9bBhzp03nwbv6MPzJJ3lj7pydZqaEb5tN+uzi5rz5Bl8vW8aEF6cx4cVp/GfhQsaMenIXotP9A2SWfev111GvfgOOPmbX55jYld/3HXfcwdKlS+nbt2+R5Z988gm33347vXv3ply5Ut8tt9uS/qWZiaSXMdtNSYX2Bp6W9AHQD0jdZZtqZt+H908Nb/OBeUBdgopiG2Y22MyamlnT6k1O2LL8+rPaM+W6rky5riur1m2gRtV9tzz3m/LlWfTNd9u816fLgj3M71ev5bUF/6XeodWKPN8orzr/fvv9jAp6eN5h/Lhy5ZbHS75eRtX9t22+euTxx3n2+YmMGjyEKlWqbFl+9JF1AGien0/1gw7ilVmzM8oFqJFXkx9/3NoPsmzpUvavmv4IqbhRTzxOrdpHsH/VquxftSpH1avHm7Myr3Br5OXxU0q5v16WefYnH33EnDffoN3xxzJ54ngWffYZ3TtflNFr8/LyWJmSu3TpUqqmyR05ciTjx49nxIgRRX7fy5cv55JLLuGcc87hjDPOyCjT7VjSm1kykfQy5krFcAsww8yOAToCFVOeW5NyX8AdZtYovNU2s0d2JejmsVO2dBhP++BTTj7mSABOa3wUmwoK+Gx50YqhfNkyHLJ/0BSxR7lyNK99GItS1mlWqwbly5Zl4twPM8rveGp7Vq1Zw5x35rNm7VrmvPMO553xhyLrPD91KgOGDOahPn2pXbPmluVfLlnC6rC/YdHixSxbvpymjRtnXPYOv/s9a1avZv7cOaxdu5Z35s3lzHPOyei1hxx6KJ9+soD169ezft06Fn7yCXXqHrXzF4ZO7XAaa1av5t3581i7di3vzZ/H6X/MbNazUc9N4KVZb/HSrDfp8PvTqVmrFg8+MiKj15522mmsXr2aefOC3Hnz5nHWWUVzJ0+ezEMPPUS/fv2oVavWluVr167l3HPPpUWLFvTs2TPjsrod22yW8S2pkl7GXDku3htYEt6/aAfrTQFukfSEma2WdDDws5n9oh68kTPfpvXRtXjpX90oKDDunjBja9B1XWl/2yAqVdiDoZefiyQkWLziB+58btqW9Tqd2IQFSzKPr1ixIl0vvJDOV10VnD7ZuDFtW7Xigu5X0KRhQ3p0uZxb772HgoIC/nJNb2Draamz3n6bOwf0BwnM+N0pp9C2Zctdyr7o0i706HY5ZkbjJs1o1aYt3TpfTKP8fC7v/leenzCOO266kYKCAkY/8TjPjH6KmW/OoXuPv/HW7Nmc2rolAmrWqkWX7n/JOHuPihX588Wd+ftfr8DMaJjfhJat2tCj62XUb9iIS7t1Z/LzE7j7tlsoKChg7FNP8tzYMUx5NfMjou2VuXPnzlxxRZDbpEkT2rRpw2WXXUajRo3o3r07ffr0oaCggKuvvhqAypUr8+KLLzJgwABWrVrF7NmzOeGE4Kjz+uuvp3379ru1TYXyxj5KuQMPAInaMyfx0+RpfHPHPSXy
3rksV5tPSlLSy6hsFkDS50BTguagkcAKYDpwvpnlSboIaGpmf0l5TQ/g0vDhauDPZvbZjnLa3PhAVgqZzTmfV1aqsvOVIrA+i3M+75XF3Zxszvl8RPbmfM6sk6iYO56blvH/4z/OOPkXZWRb0suY1SMGM8sL734LHJny1L/C50cAI4q9ZgAwIPqtc85FIel705lIehlzpY/BOVdKxNUxK2k/SVMlLQx/7ptmncMkzZX0jqQPJXXdrdBQ0svoFYNzLla2C7fddA0wzcyOAKaFj4tbBhxvZo2AY4FrJFXf3eCklzFXOp+dc6VEjOMI/S/QJrw/EngZ6J26gpltTHlYgRLaWU56Gf2IwTkXqxgv/vqtmS0LM5cBaQfaknSopPeAL4G7zGzp7gYnvYx+xOCci9WutKtL6gJ0SVk02MwGpzz/ElBtmxfCdZlmmNmXQIOweeU5SWPNbHnGG5lG0svoFYNzLla7spMcfkEO3sHz7bb3nKTlkg4ys2WSDgJ2eMGRmS2V9CFwIjA2861M9167sm7uldGbkpxzsYqxmWU8cGF4/0JgXPEVJB0i6Tfh/X2BE4BPdjc46WX0IwbnXKwyGWyyhNwJjJHUGfgCOBtAUlOgq5ldChwF3CPJCC7Yu9vMMhv4bAeSXkavGJxzsYrr4i8z+w44Oc3yOYSjJ5jZVKBBBNkl/Zbby4mkjF4xOJdwC1uWzNhNu+qXDsWR9KuCM5H0MpaKiuHlGzMf7O3X4pfNQ1cCKlfc+Tq/QntmabyibFUKu6Mg2d+ZGUl6GUtFxeCcyx1J35vORNLL6BWDcy5WSf/SzETSy+gVg3MuVjEOF5E1SS+jVwzOuVglvf09E0kvo1cMzrlYFViy96YzkfQyesXgnItVwpvfM5L0MnrF4JyLVdI7ZjOR9DJ6xeCci1XSO2YzkfQyesXgnItV0vemM5H0MnrF4JyLVdLP2MlE0svoFYNzpdBhox6h/MHVsU2b+Kxtx1izk743nYmklzG2+Rgk5Un64Be+to2kiSW9Tc6VVj+OHcfXN/fJSnYBlvEtqZJeRj9icK4UWvnMeCrWr5eV7KTvTWci6WWMewa3cpJGSnpP0lhJlSSdLGm+pPclDZNUAUBSB0kLJL0GnBkuKyNpoaQDUh7/R1LVmMvhnPuFNm8uyPiWVEkvY9wVQx2Cia4bAD8BPYERwJ/MrD7BEUw3SRWBIUBHgrlJqwGYWQHwONApfL92wLtm9m3xIEldJM2RNGfw4O1Op+qci1mBZX5LqqSXMe6mpC/N7PXw/uPAv4BFZvZpuGwk0B14OVy+EEDS40CXcJ1hBPOa9gcuAYanCyo2wXaO/vqdK32S3sySiaSXMe6KYVd+W2nXNbMvJS2X1BY4lq1HD865BLBSsJ+W9DLG3ZRUQ1KL8P55wEtAnqTa4bLzgZnAAqCmpFop66YaSnDEMcbMNke8zc796uSNfZRDHrwblS9P7ZmTOPAfV8eWXWCW8S2pkl7GuI8YPgYulPQwsBDoAbwBPC2pHPA2MMjMNkjqAjwv6VvgNeCYlPcZT9CElLYZyTm3Y5+fdUHWspPezJKJpJcxtorBzD4Hjk7z1DSgcZr1JwN1t/N2DQk6nReU2AY652KxOVd7XEtQ0suYuOsYJF0DdMP7FpxLpKTvTWci6WVMXMVgZncCd2Z7O5xzv0yutquXpKSXMXEVg3Mu2ZL+pZmJpJfRKwbnXKyS3sySiaSX0SsG51yskt4xm4mkl9ErBudcrJK+N52JpJfRKwbnXKyS3v6eiaSX0SsG51yskr43nYmkl9ErBudcrBL+nZmRpJfRKwbnXKyS3sySiaSXUUk/5ImapC7hEN6e/SvOzWZ2aSyzy21xj66aRF12vopn/wpys5ldGsvscphXDM4554rwisE551wRXjHsXDbbX0tjtpe59GS7HOWdz84554rwIwbnnHNFeMXgnHOuCK8YnHPOFeEVQxqSzpa0Z3j/n5KelZSf7e2Kg6TKWcqtJul0
SR0lVcvGNsRNUktJF4f3D5BUM6bc/eLIccnlnc9pSHrPzBpIagncAdwNXGtmx0aYuQrY7h/DzPaKKjvMPx4YClQxsxqSGgKXm9kVUeaG2ZcC1wPTAQGtgZvNbFgM2RWAPwJ5pAwRY2Y3R5x7A9AUqGNmR0qqDjxtZidEmRtmLwTeAYYDL5h/CbhifKyk9DaHP38HDDSzcZJujDLQzAqPUG4GvgYeI/iS7ATsGWV2qB/QHhgfbs+7klrFkAvwd6CxmX0HIGl/YBYQecUAjANWAnOBDTHkFfoD0BiYB2BmSwuPUmNwJNAOuAS4X9JoYISZfRpTvstxXjGkt0TSwwT/PHeFe5VxNbu1L3ZkMlDSm0CfqIPN7EtJqYs2b2/dEvYVsCrl8Srgy5iyDzGzDjFlpdpoZibJIN4mvPAIYSowVdJJwOPAFZLeBa4xs9lxbYvLTd7HkN45wBSgg5n9COxHsFcbh82SOkkqK6mMpE7E8wX9ZdicZJL2kNQL+DiGXIAlwJuSbgybWN4A/iOpp6SeEWfPklQ/4ox0xoQ7H/tIugx4CRgSR7Ck/SX1kDQH6AX8FagKXA08Gcc2uNzmfQxpSKoFfGVmGyS1ARoAj4aVRNTZecAA4ASCPofXgavM7POIc6uGue0ImrBeBHoUNu9EnH3Djp43s5siyHyf4PdbDjgC+C9BU5KCSGtQ0plptuEU4NQwc4qZTY06M8z9lKCpcriZfVXsud5mdlcc2+Fyl1cMaUh6h6BjMI/gyGE8QSfhadncrtJAUhmCDvCfIs45bEfPm9niCLPLElQE7aLK2Em+vMPZ7Yj3MaRXYGabJJ0J9Dez+yXNjyNY0pHAQOC3ZnaMpAbA6WZ2a8S596VZvBKYY2bjIs5+EuhK0GQ2F9hb0r1m1jeqzMIvfknHAR+a2arw8Z7A0UBkFYOZbZa0VtLeZrYyqpziJE0gPPOtWF9S4XadHte2uNzmfQzp/SzpPOACYGK4rHxM2UOAfwA/A5jZe8C5MeRWBBoBC8NbA4K+lc6S+kecfXR4hHAGMAmoAZwfcWahgcDqlMdrwmVRWw+8L+kRSfcV3iLOvBu4B1gErCP4rA0hKP8HEWe7BPEjhvQuJtiDvc3MFoUXHj0eU3YlM3ur2B7dphhyawNtzWwTgKSBBP0MpwDvR5xdXlJ5gorhATP7ufBsnRgUaVYxswJJcfxfPB/eYmNmMwEk3WJmqaciT5D0Spzb4nKbVwxpmNlHwJUpjxcBd8YU/23Y+V14yH8WsCyG3IOBygTNR4T3q4fNHlGf3/8w8DnwLvBK2P4faR9Div9KupKtRwlXEHRER8rMRkadsQMHSDrczP4LEO74HJDF7XE5xjuf05B0BMEVz0cTNLEAYGaHx5B9OMEY+ccDPxAc9neKsjM0zO0M/BN4meAsmVbA7cAo4EYzi+t03cLtKVd49BJxzoHAfUBbgsp4GsFZYN9EnJvNz1gHgs9YYQWYB3QxsxejznbJ4BVDGpJeA24guBq4I0HTksxsh6dVllB22XAvvTJQprBTNA7hsAznAwsIjhi+MrPImxjCK51vAFoSfDm/RjAkRqSnyoZnB11pZv2izNlOdtY+Y2F+BaBu+HCBmcV51bfLcV4xpCFprpk1kfS+mdUPl71qZifGkP0FMBkYDUyP67TCcLyiHsAhBOPoHAfMNrO2MWRPBV5haz9OJ6BNHKdzSnrZzNpEnZMmN5ufsfJAN4KjQgiOEh82s5+jznbJ4Gclpbc+PJ9+oaS/SPoDcGBM2XUIroLtDiyS9EA4mF/UegDNgMVmdhLBOD4rYsgF2M/MbjGzReHtVmCfmLJfD3/HJ0rKL7zFkJvNz9hAoAnwUHhrQjxnYrmE8M7n9K4CKhF0QN9C0P58YRzBZrYOGEMwZMK+BFcjzwTKRhy93szWS0JSBTNbIKlOxJmFZkg6l6DcAGcR3xk7x4c/U0dTNYK/eYmT9JiZnU8weF9WPmNAMzNrmPJ4ejhOknOANyXl
JEmtgT8B/wO8DYw2s2cizvw3QTv3VQRfUj8A5aO82ltbhxoXQZ9G4ZhQZYHVUQ81ng2SPiL4u44H2hCUfQsz+z6GbZgHnG1mn4WPDwfGmlmpmHPE7ZxXDClSrwxNJ44rQyUtImjjHwOMN7M1UWem2YbWwN7AZDPbGHd+3CT9DqhH0bODIpmPITw1thtwOMHggWJr5WgxnZV0MsFcDKlnJV1sZjOiznbJ4BVDivALcbsKLxCKeBv2inqcoFwhqW7YZJV2T9XM5sWwDYMImnROIpio6CzgLTPrHHHuQDPrFmXGDrIrEoykenK4aCrQz8zWZ2N7XO7xiiGN8FTRdWZWED4uC1Qws7URZv6fmfWRdD9pjlrM7Mo0L0s0SYPNrIuk1D3V1KuQ4zgjqnC2vsKfVYBnzezUqLOzRdIYggsInwgXnQfsa2ZnZ2+rXC7xzuf0phEMP104hs5vCIaHOH67r9h9hXMfzIkwI6eYWZfw7kCCZqufJP0LyCfokI3DuvDn2vA6ju+AWOZezqI6xTqfZ3jns0vlFUN6Fc1sy8BqZrZaUqUoA81sQnj3PTOLZSTXHPJPMxsTnpZ7CsFAbwOByObYTjFR0j4EM+TNDZcNjSE3m+ZLOs7M3gCQdCzBvB/OAV4xbM8aSfmFbdySmrJ1zzJq90o6CHgaeMrMPowpN5tS59geZDHMsZ3iboLO4BOB2cCr/ErP6dfWyYnKAxeEF1MacBjwUTa3zeUW72NIQ1Iz4ClgKcE/TnXgT2Y2d4cvLLn8agTTi/4J2IvgdNVI52PIJkkTCc7QaUdwsdU6gg7ghjt8YclkjyGYY7rwquvzgH3M7Jyos+OmLE5O5JLFK4Y0wrM2/gq0J+ikmw3cH/dZGwrmIv4/gkppjziz4xQ203UA3jezheERU/04BnWT9G7xCijdMudKE68Y0sjmWRuSjiI4UjiLoCP0KeCZqEf7LK0kjSBovkptb7/QzK7I6oY5l0VeMaSRzb1ISW8QDHX9tJktjTqvtJP0McH4VF+Ei2oQnCFWQHDBWYNsbZtz2eKdz+ll5ayN8HqJz8xsQNRZbosO2d4A53KNHzGkkc29SEmTgdNLw1AUzrnc5EcM6WVzL3IxwVDQ4wkmpgfAzO7N3iY550oTrxjSyPJpe0vDWxlgzyxuh3OulPKmJOecc0X4EUOOCQeUSzeIXuQDyjnnHHjFkIt6pdyvCPwR2JSlbXHOlULelJQAkmaa2Q7ninDOuZLiRww5RtJ+KQ/LAE2BalnaHOdcKeQVQ+6Zy9apHn8GPgcinU3MOedSlcn2Brht9AYamVlN4DGCaxkimznOOeeK84oh9/wznMmscNKaEfxK5wdwzuUmrxhyzzaT1gC/2iG3nXO5xyuG3LNE0sMEE/VMklQB/zs552Lkp6vmmGxOWuOcc+AVg3POuWK8icI551wRXjE455wrwisG55xzRXjF4JxzrgivGJxzzhXx/2L9mkVn4l0aAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 3 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "df.plot.correlation(\"*\")\n",
    "df.plot.correlation(\"*\", output_format=\"image\", output_path=\"images/correlation.png\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Using other plotting libraries"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Optimus exposes a small API that lets you use any plotting library. For example, ```df.cols.scatter()```, ```df.cols.frequency()```, ```df.cols.boxplot()``` and ```df.cols.hist()``` return JSON that you can reshape for the plotting library of your choice."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Outliers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Get the outliers using Tukey"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'age' with function _cast_to\n",
      "INFO:optimus:percentile() executed in 4.02 sec\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading page (1/2)\n",
      "Rendering (2/2)                                                    \n",
      "Done                                                               \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/table5.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "df.outliers.tukey(\"age\").select().table_image(\"images/table5.png\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Remove the outliers using Tukey"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'age' with function _cast_to\n",
      "INFO:optimus:percentile() executed in 4.1 sec\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading page (1/2)\n",
      "Rendering (2/2)                                                    \n",
      "Done                                                               \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/table6.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "df.outliers.tukey(\"age\").drop().table_image(\"images/table6.png\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'age' with function _cast_to\n",
      "INFO:optimus:percentile() executed in 3.94 sec\n",
      "INFO:optimus:Using 'column_exp' to process column 'age' with function _cast_to\n",
      "INFO:optimus:percentile() executed in 4.09 sec\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "{'count_outliers': 1045,\n",
       " 'count_non_outliers': 1036,\n",
       " 'lower_bound': -6.0,\n",
       " 'upper_bound': 66.0,\n",
       " 'iqr1': 21.0,\n",
       " 'iqr3': 39.0}"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.outliers.tukey(\"age\").info()"
   ]
  },
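  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The bounds reported by `info()` above can be reproduced by hand. The following is a minimal sketch of Tukey's fences in plain Python, not Optimus code: `tukey_fences` is a hypothetical helper, and it reads `iqr1` and `iqr3` as the first and third quartiles, an assumption that is consistent with the reported bounds.\n",
    "\n",
    "```python\n",
    "def tukey_fences(q1, q3, k=1.5):\n",
    "    # Interquartile range: the spread of the middle half of the data.\n",
    "    iqr = q3 - q1\n",
    "    # Values outside [lower, upper] are treated as outliers.\n",
    "    return q1 - k * iqr, q3 + k * iqr\n",
    "\n",
    "\n",
    "# Quartiles copied from the info() output above (assumed to be Q1 and Q3).\n",
    "lower, upper = tukey_fences(21.0, 39.0)\n",
    "print(lower, upper)\n",
    "```\n",
    "\n",
    "With these quartiles the sketch prints `-6.0 66.0`, which matches `lower_bound` and `upper_bound`."
   ]
  },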
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### You can also use z_score, modified_z_score or mad\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'age' with function _z_score\n",
      "INFO:optimus:Using 'column_exp' to process column 'age' with function _cast_to\n",
      "INFO:optimus:Using 'column_exp' to process column 'age' with function _cast_to\n",
      "INFO:optimus:Using 'column_exp' to process column 'age' with function _cast_to\n",
      "INFO:optimus:Using 'column_exp' to process column 'age' with function _cast_to\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "DataFrame[pclass: bigint, survived: bigint, name: string, sex: string, age: double, sibsp: bigint, parch: bigint, ticket: string, fare: double, cabin: string, embarked: string, boat: string, body: double, home_dest: string]"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.outliers.z_score(\"age\", threshold=2).drop()\n",
    "df.outliers.modified_z_score(\"age\", threshold=2).drop()\n",
    "df.outliers.mad(\"age\", threshold=2).drop()"
   ]
  },
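  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the `threshold` argument in the calls above more concrete, here is a rough sketch of the textbook z-score and modified z-score in plain Python. It is an illustration only: the helper names and the sample list are made up, the 0.6745 constant is the usual normal-consistency factor, and Optimus may compute these scores differently internally.\n",
    "\n",
    "```python\n",
    "import statistics\n",
    "\n",
    "\n",
    "def z_scores(values):\n",
    "    # Distance from the mean, measured in sample standard deviations.\n",
    "    mean = statistics.mean(values)\n",
    "    std = statistics.stdev(values)\n",
    "    return [(v - mean) / std for v in values]\n",
    "\n",
    "\n",
    "def modified_z_scores(values):\n",
    "    # Distance from the median, scaled by the median absolute deviation (MAD).\n",
    "    med = statistics.median(values)\n",
    "    mad = statistics.median([abs(v - med) for v in values])\n",
    "    # Note: this divides by zero when more than half the values are identical.\n",
    "    return [0.6745 * (v - med) / mad for v in values]\n",
    "\n",
    "\n",
    "ages = [22, 25, 27, 29, 30, 31, 33, 80]  # made-up sample, not the Titanic data\n",
    "threshold = 2\n",
    "scores = modified_z_scores(ages)\n",
    "# Keep only the rows whose absolute score is within the threshold.\n",
    "kept = [a for a, s in zip(ages, scores) if abs(s) <= threshold]\n",
    "print(kept)\n",
    "```\n",
    "\n",
    "On this sample only the value 80 is dropped. Because the median and MAD are barely affected by a single extreme value, the modified score tends to be more robust than the plain z-score on small or skewed samples."
   ]
  },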
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Database connection\n",
    "Optimus has handy tools to connect to databases and extract information. Optimus can handle **redshift**, **postgres**, **oracle** and **mysql**."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Operative System:Windows\n",
      "INFO:optimus:Just check that Spark and all necessary environments vars are present...\n",
      "INFO:optimus:-----\n",
      "INFO:optimus:SPARK_HOME=C:\\opt\\spark\\spark-2.3.1-bin-hadoop2.7\n",
      "INFO:optimus:HADOOP_HOME=C:\\opt\\hadoop-2.7.7\n",
      "INFO:optimus:PYSPARK_PYTHON=C:\\Users\\argenisleon\\Anaconda3\\python.exe\n",
      "INFO:optimus:PYSPARK_DRIVER_PYTHON=jupyter\n",
      "INFO:optimus:PYSPARK_SUBMIT_ARGS=--jars \"file:///C:/Users/argenisleon/Documents/Optimus/optimus/jars/RedshiftJDBC42-1.2.16.1027.jar,file:///C:/Users/argenisleon/Documents/Optimus/optimus/jars/mysql-connector-java-8.0.16.jar,file:///C:/Users/argenisleon/Documents/Optimus/optimus/jars/ojdbc8.jar,file:///C:/Users/argenisleon/Documents/Optimus/optimus/jars/postgresql-42.2.5.jar,file:///C:/Users/argenisleon/Documents/Optimus/optimus/jars/RedshiftJDBC42-1.2.16.1027.jar,file:///C:/Users/argenisleon/Documents/Optimus/optimus/jars/mysql-connector-java-8.0.16.jar,file:///C:/Users/argenisleon/Documents/Optimus/optimus/jars/ojdbc8.jar,file:///C:/Users/argenisleon/Documents/Optimus/optimus/jars/postgresql-42.2.5.jar\" --driver-class-path \"C:/Users/argenisleon/Documents/Optimus/optimus/jars/RedshiftJDBC42-1.2.16.1027.jar;C:/Users/argenisleon/Documents/Optimus/optimus/jars/mysql-connector-java-8.0.16.jar;C:/Users/argenisleon/Documents/Optimus/optimus/jars/ojdbc8.jar;C:/Users/argenisleon/Documents/Optimus/optimus/jars/postgresql-42.2.5.jar;C:/Users/argenisleon/Documents/Optimus/optimus/jars/RedshiftJDBC42-1.2.16.1027.jar;C:/Users/argenisleon/Documents/Optimus/optimus/jars/mysql-connector-java-8.0.16.jar;C:/Users/argenisleon/Documents/Optimus/optimus/jars/ojdbc8.jar;C:/Users/argenisleon/Documents/Optimus/optimus/jars/postgresql-42.2.5.jar\" --conf \"spark.sql.catalogImplementation=hive\" pyspark-shell\n",
      "INFO:optimus:JAVA_HOME=C:\\java\n",
      "INFO:optimus:Pyarrow Installed\n",
      "INFO:optimus:-----\n",
      "INFO:optimus:Starting or getting SparkSession and SparkContext...\n",
      "INFO:optimus:Spark Version:2.3.1\n",
      "INFO:optimus:\n",
      "                             ____        __  _                     \n",
      "                            / __ \\____  / /_(_)___ ___  __  _______\n",
      "                           / / / / __ \\/ __/ / __ `__ \\/ / / / ___/\n",
      "                          / /_/ / /_/ / /_/ / / / / / / /_/ (__  ) \n",
      "                          \\____/ .___/\\__/_/_/ /_/ /_/\\__,_/____/  \n",
      "                              /_/                                  \n",
      "                              \n",
      "INFO:optimus:Transform and Roll out...\n",
      "INFO:optimus:Optimus successfully imported. Have fun :).\n",
      "INFO:optimus:Config.ini not found\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<style> /* Tables*/\n",
       "\n",
       " .data_type {\n",
       "        font-size: 0.8em;\n",
       "        font-weight: normal;\n",
       "    }\n",
       "\n",
       "    .column_name {\n",
       "        font-size: 1.2em;\n",
       "    }\n",
       "\n",
       "    .info_items {\n",
       "        margin: 10px 0;\n",
       "        font-size: 0.8em;\n",
       "    }\n",
       "\n",
       "    .optimus_table td {\n",
       "        border: 0px;\n",
       "    }\n",
       "\n",
       "    .optimus_table tr:nth-child(even) {\n",
       "        background-color: #f2f2f2 !important;\n",
       "    }\n",
       "\n",
       "    .optimus_table tr:nth-child(odd) {\n",
       "        background-color: #ffffff !important;\n",
       "    }\n",
       "\n",
       "    .optimus_table thead {\n",
       "        border-bottom: 1px solid black;\n",
       "    }\n",
       "    .optimus_table{\n",
       "        font-size: 12px;\n",
       "    }\n",
       "\n",
       "    .optimus_table tbody{\n",
       "        font-family: monospace;\n",
       "        border-bottom: 1px solid #cccccc;\n",
       "    }\n",
       "\n",
       "    /* Profiler */\n",
       "        .main{\n",
       "        width:100%;\n",
       "        overflow:auto;\n",
       "        border-bottom:1px solid #eeeeee;\n",
       "        padding: 10px 0;\n",
       "    }\n",
       "    .panel_profiler{\n",
       "        margin-right:2%;\n",
       "        float:left;\n",
       "        padding-bottom:2%;\n",
       "    }\n",
       "    .panel_profiler tbody{\n",
       "        font-family:monospace;\n",
       "    }\n",
       "    .title_profiler{\n",
       "        padding:20px;\n",
       "        background-color: #eeeeee\n",
       "    }\n",
       "    .info{\n",
       "        overflow: auto\n",
       "    }\n",
       "    .main td, main th{\n",
       "        padding:0em\n",
       "    }\n",
       "    .panel_profiler td {\n",
       "        padding:0.2em\n",
       "    }\n",
       "    .none, .true{\n",
       "        color:#0000ff\n",
       "    }\n",
       "    .optimus_table th {\n",
       "        font-family:sans-serif;\n",
       "    }\n",
       "\n",
       "    .info_items{\n",
       "        font-family:sans-serif;\n",
       "        font-size:10px;\n",
       "    }\n",
       "</style>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import sys\n",
    "sys.path.append(\"..\")\n",
    "\n",
    "from optimus import Optimus\n",
    "op = Optimus(verbose=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:jdbc:redshift://yopter.ccyy9l6xansm.us-east-1.redshift.amazonaws.com:5439/yopterdwh?currentSchema=public\n",
      "INFO:optimus:(\n",
      "            SELECT relname as table_name,cast (reltuples as integer) AS count \n",
      "            FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace) \n",
      "            WHERE nspname IN ('public') AND relkind='r' ORDER BY reltuples DESC) AS t\n",
      "INFO:optimus:jdbc:redshift://yopter.ccyy9l6xansm.us-east-1.redshift.amazonaws.com:5439/yopterdwh?currentSchema=public\n"
     ]
    },
    {
     "ename": "Py4JJavaError",
     "evalue": "An error occurred while calling o6140.load.\n: java.sql.SQLException: No suitable driver\r\n\tat java.sql.DriverManager.getDriver(DriverManager.java:315)\r\n\tat org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$7.apply(JDBCOptions.scala:85)\r\n\tat org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$7.apply(JDBCOptions.scala:85)\r\n\tat scala.Option.getOrElse(Option.scala:121)\r\n\tat org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:84)\r\n\tat org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:35)\r\n\tat org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:34)\r\n\tat org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:340)\r\n\tat org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)\r\n\tat org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)\r\n\tat org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)\r\n\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\r\n\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\r\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\r\n\tat java.lang.reflect.Method.invoke(Method.java:498)\r\n\tat py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)\r\n\tat py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)\r\n\tat py4j.Gateway.invoke(Gateway.java:282)\r\n\tat py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)\r\n\tat py4j.commands.CallCommand.execute(CallCommand.java:79)\r\n\tat py4j.GatewayConnection.run(GatewayConnection.java:238)\r\n\tat java.lang.Thread.run(Thread.java:748)\r\n",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mPy4JJavaError\u001b[0m                             Traceback (most recent call last)",
      "\u001b[1;32m<ipython-input-42-f35140062f08>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m\u001b[0m\n\u001b[0;32m     13\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     14\u001b[0m \u001b[1;31m# Show all tables names\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 15\u001b[1;33m \u001b[0mdb\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mtables\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mlimit\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;34m\"all\"\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[1;32m~\\Documents\\Optimus\\optimus\\io\\jdbc.py\u001b[0m in \u001b[0;36mtables\u001b[1;34m(self, schema, database, limit)\u001b[0m\n\u001b[0;32m    123\u001b[0m                     FROM user_tables ORDER BY table_name\"\"\"\n\u001b[0;32m    124\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 125\u001b[1;33m         \u001b[0mdf\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mexecute\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mquery\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mlimit\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m    126\u001b[0m         \u001b[1;32mreturn\u001b[0m \u001b[0mdf\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mtable\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mlimit\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    127\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32m~\\Documents\\Optimus\\optimus\\io\\jdbc.py\u001b[0m in \u001b[0;36mexecute\u001b[1;34m(self, query, limit)\u001b[0m\n\u001b[0;32m    230\u001b[0m             \u001b[0mconf\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0moption\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"driver\"\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdriver_option\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    231\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 232\u001b[1;33m         \u001b[1;32mreturn\u001b[0m \u001b[0mconf\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mload\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m    233\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    234\u001b[0m     \u001b[1;32mdef\u001b[0m \u001b[0mdf_to_table\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mdf\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mtable\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mmode\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;34m\"overwrite\"\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\pyspark\\sql\\readwriter.py\u001b[0m in \u001b[0;36mload\u001b[1;34m(self, path, format, schema, **options)\u001b[0m\n\u001b[0;32m    170\u001b[0m             \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_df\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_jreader\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mload\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_spark\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_sc\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_jvm\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mPythonUtils\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mtoSeq\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mpath\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    171\u001b[0m         \u001b[1;32melse\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 172\u001b[1;33m             \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_df\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_jreader\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mload\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m    173\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    174\u001b[0m     \u001b[1;33m@\u001b[0m\u001b[0msince\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;36m1.4\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\py4j\\java_gateway.py\u001b[0m in \u001b[0;36m__call__\u001b[1;34m(self, *args)\u001b[0m\n\u001b[0;32m   1255\u001b[0m         \u001b[0manswer\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mgateway_client\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msend_command\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mcommand\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1256\u001b[0m         return_value = get_return_value(\n\u001b[1;32m-> 1257\u001b[1;33m             answer, self.gateway_client, self.target_id, self.name)\n\u001b[0m\u001b[0;32m   1258\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m   1259\u001b[0m         \u001b[1;32mfor\u001b[0m \u001b[0mtemp_arg\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mtemp_args\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\pyspark\\sql\\utils.py\u001b[0m in \u001b[0;36mdeco\u001b[1;34m(*a, **kw)\u001b[0m\n\u001b[0;32m     61\u001b[0m     \u001b[1;32mdef\u001b[0m \u001b[0mdeco\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m*\u001b[0m\u001b[0ma\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m**\u001b[0m\u001b[0mkw\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     62\u001b[0m         \u001b[1;32mtry\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 63\u001b[1;33m             \u001b[1;32mreturn\u001b[0m \u001b[0mf\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m*\u001b[0m\u001b[0ma\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m**\u001b[0m\u001b[0mkw\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m     64\u001b[0m         \u001b[1;32mexcept\u001b[0m \u001b[0mpy4j\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mprotocol\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mPy4JJavaError\u001b[0m \u001b[1;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     65\u001b[0m             \u001b[0ms\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0me\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mjava_exception\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mtoString\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\py4j\\protocol.py\u001b[0m in \u001b[0;36mget_return_value\u001b[1;34m(answer, gateway_client, target_id, name)\u001b[0m\n\u001b[0;32m    326\u001b[0m                 raise Py4JJavaError(\n\u001b[0;32m    327\u001b[0m                     \u001b[1;34m\"An error occurred while calling {0}{1}{2}.\\n\"\u001b[0m\u001b[1;33m.\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 328\u001b[1;33m                     format(target_id, \".\", name), value)\n\u001b[0m\u001b[0;32m    329\u001b[0m             \u001b[1;32melse\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    330\u001b[0m                 raise Py4JError(\n",
      "\u001b[1;31mPy4JJavaError\u001b[0m: An error occurred while calling o6140.load.\n: java.sql.SQLException: No suitable driver\r\n\tat java.sql.DriverManager.getDriver(DriverManager.java:315)\r\n\tat org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$7.apply(JDBCOptions.scala:85)\r\n\tat org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$7.apply(JDBCOptions.scala:85)\r\n\tat scala.Option.getOrElse(Option.scala:121)\r\n\tat org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:84)\r\n\tat org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:35)\r\n\tat org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:34)\r\n\tat org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:340)\r\n\tat org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)\r\n\tat org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)\r\n\tat org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)\r\n\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\r\n\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\r\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\r\n\tat java.lang.reflect.Method.invoke(Method.java:498)\r\n\tat py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)\r\n\tat py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)\r\n\tat py4j.Gateway.invoke(Gateway.java:282)\r\n\tat py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)\r\n\tat py4j.commands.CallCommand.execute(CallCommand.java:79)\r\n\tat py4j.GatewayConnection.run(GatewayConnection.java:238)\r\n\tat java.lang.Thread.run(Thread.java:748)\r\n"
     ]
    }
   ],
   "source": [
    "# This import is only to hide the credentials\n",
    "from credentials import *\n",
    "\n",
    "# db_type accepts 'oracle', 'mysql', 'redshift' or 'postgres'\n",
    "\n",
    "db =  op.connect(\n",
    "    db_type=DB_TYPE,\n",
    "    host=HOST,\n",
    "    database= DATABASE,\n",
    "    user= USER,\n",
    "    password = PASSWORD,\n",
    "    port=PORT)\n",
    "    \n",
    "# Show all table names\n",
    "db.tables(limit=\"all\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Show a summary of every table\n",
    "db.table.show(\"*\",20)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Get a table as a dataframe\n",
    "df_ = db.table_to_df(\"places_interest\").table()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create a new table in the database\n",
    "db.df_to_table(df, \"new_table\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data enrichment\n",
    "\n",
    "You can connect to any external API to enrich your data using Optimus. Optimus downloads the data into MongoDB and then merges it with the Spark DataFrame, so MongoDB needs to be installed.\n",
    "\n",
    "Let's load a tiny dataset we can enrich:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = op.load.json(\"https://raw.githubusercontent.com/ironmussa/Optimus/master/examples/data/foo.json\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import requests\n",
    "\n",
    "def func_request(params):\n",
    "    # Add here whatever headers or auth info you need to send.\n",
    "    # For more information see the requests library documentation\n",
    "    \n",
    "    url= \"https://jsonplaceholder.typicode.com/todos/\" + str(params[\"id\"])\n",
    "    return requests.get(url)\n",
    "\n",
    "def func_response(response):\n",
    "    # Parse the response here\n",
    "    return response[\"title\"]\n",
    "\n",
    "\n",
    "e = op.enrich(host=\"localhost\", port=27017, db_name=\"jazz\")\n",
    "\n",
    "df_result = e.run(df, func_request, func_response, calls= 60, period = 60, max_tries = 8)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_result.table(\"all\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_result.table_image(\"images/table7.png\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Clustering Strings"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Optimus implements several functions to cluster strings, heavily inspired by OpenRefine.\n",
    "\n",
    "Here is a quote from its site:\n",
    "\n",
    "\"In OpenRefine, clustering refers to the operation of \"finding groups of different values that might be alternative representations of the same thing\". For example, the two strings \"New York\" and \"new york\" are very likely to refer to the same concept and just have capitalization differences. Likewise, \"Gödel\" and \"Godel\" probably refer to the same person.\"\n",
    "\n",
    "For more information see:\n",
    "https://github.com/OpenRefine/OpenRefine/wiki/Clustering-In-Depth"
   ]
  },
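  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The idea of grouping alternative spellings by a shared key can be sketched in plain Python. This is a minimal illustration based on the OpenRefine description of fingerprinting, not Optimus' own implementation; `fingerprint` and `cluster` are hypothetical helper names introduced only for this example.\n",
    "\n",
    "```python\n",
    "import string\n",
    "import unicodedata\n",
    "from collections import defaultdict\n",
    "\n",
    "\n",
    "def fingerprint(value):\n",
    "    # Trim, lowercase and strip ASCII punctuation\n",
    "    text = value.strip().lower()\n",
    "    text = text.translate(str.maketrans(\"\", \"\", string.punctuation))\n",
    "    # Decompose accented characters and drop the combining marks\n",
    "    text = unicodedata.normalize(\"NFKD\", text)\n",
    "    text = \"\".join(ch for ch in text if not unicodedata.combining(ch))\n",
    "    # Split into tokens, drop duplicates, sort and rejoin\n",
    "    return \" \".join(sorted(set(text.split())))\n",
    "\n",
    "\n",
    "def cluster(values):\n",
    "    # Values that produce the same key are grouped together\n",
    "    groups = defaultdict(list)\n",
    "    for value in values:\n",
    "        groups[fingerprint(value)].append(value)\n",
    "    return dict(groups)\n",
    "\n",
    "\n",
    "clusters = cluster([\"New York\", \"new york\", \"York, New\", \"Gödel\", \"Godel\"])\n",
    "```\n",
    "\n",
    "Values that share a key, such as \"New York\" and \"new york\", end up in the same group, which is then a candidate for merging into a single recommended value."
   ]
  },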
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Key Collision"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = op.read.csv(\"../examples/data/random.csv\",header=True, sep=\";\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [],
   "source": [
    "from optimus.ml import keycollision as keyCol"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'STATE_FINGERPRINT' with function _trim\n",
      "INFO:optimus:Using 'column_exp' to process column 'STATE_FINGERPRINT' with function _lower\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function multiple_replace\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function _remove_accents\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function _split_sort_remove_join\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "\n",
       "\n",
       "\n",
       "\n",
       "<div class=\"info_items\">Viewing 5 of 5 rows / 4 columns</div>\n",
       "<div class=\"info_items\">200 partition(s)</div>\n",
       "\n",
       "<table class=\"optimus_table\">\n",
       "    <thead>\n",
       "    <tr>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">STATE_CLUSTER_SIZE</div>\n",
       "            <div class=\"data_type\">1 (int)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                not nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">STATE_CLUSTER</div>\n",
       "            <div class=\"data_type\">2 (array&lt;string&gt;)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">STATE_COUNT</div>\n",
       "            <div class=\"data_type\">3 (bigint)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">STATE_RECOMMENDED</div>\n",
       "            <div class=\"data_type\">4 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "    </tr>\n",
       "\n",
       "    </thead>\n",
       "    <tbody>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='1'>1\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='['Estado&#8901;de&#8901;México']'>['Estado&#8901;de&#8901;México']\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='810'>810\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Estado&#8901;de&#8901;México'>Estado&#8901;de&#8901;México\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='2'>2\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='['México&#8901;D.F.',&#8901;'Mexico&#8901;D.F.']'>['México&#8901;D.F.',&#8901;'Mexico&#8901;D.F.']\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='2495'>2495\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Mexico&#8901;D.F.'>Mexico&#8901;D.F.\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='1'>1\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='['D.F.']'>['D.F.']\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='66'>66\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='D.F.'>D.F.\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='1'>1\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='['Distriro&#8901;Federal']'>['Distriro&#8901;Federal']\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='259'>259\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Distriro&#8901;Federal'>Distriro&#8901;Federal\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='3'>3\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='['Distrito&#8901;Federal',&#8901;'DISTRITO&#8901;FEDERAL',&#8901;'distrito&#8901;federal']'>['Distrito&#8901;Federal',&#8901;'DISTRITO&#8901;FEDERAL',&#8901;'distrito&#8901;federal']\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='11930'>11930\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Distrito&#8901;Federal'>Distrito&#8901;Federal\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    </tbody>\n",
       "</table>\n",
       "\n",
       "\n",
       "<div class=\"info_items\">Viewing 5 of 5 rows / 4 columns</div>\n",
       "<div class=\"info_items\">200 partition(s)</div>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading page (1/2)\n",
      "Rendering (2/2)                                                    \n",
      "Done                                                               \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/table8.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "df_kc = keyCol.fingerprint_cluster(df, 'STATE')\n",
    "df_kc.table()\n",
    "df_kc.table_image(\"images/table8.png\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'STATE_FINGERPRINT' with function _trim\n",
      "INFO:optimus:Using 'column_exp' to process column 'STATE_FINGERPRINT' with function _lower\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function multiple_replace\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function _remove_accents\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function _split_sort_remove_join\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[{'STATE_CLUSTER_SIZE': 1,\n",
       "  'STATE_CLUSTER': ['Estado de México'],\n",
       "  'STATE_COUNT': 810,\n",
       "  'STATE_RECOMMENDED': 'Estado de México'},\n",
       " {'STATE_CLUSTER_SIZE': 2,\n",
       "  'STATE_CLUSTER': ['México D.F.', 'Mexico D.F.'],\n",
       "  'STATE_COUNT': 2495,\n",
       "  'STATE_RECOMMENDED': 'Mexico D.F.'},\n",
       " {'STATE_CLUSTER_SIZE': 1,\n",
       "  'STATE_CLUSTER': ['D.F.'],\n",
       "  'STATE_COUNT': 66,\n",
       "  'STATE_RECOMMENDED': 'D.F.'},\n",
       " {'STATE_CLUSTER_SIZE': 1,\n",
       "  'STATE_CLUSTER': ['Distriro Federal'],\n",
       "  'STATE_COUNT': 259,\n",
       "  'STATE_RECOMMENDED': 'Distriro Federal'},\n",
       " {'STATE_CLUSTER_SIZE': 3,\n",
       "  'STATE_CLUSTER': ['Distrito Federal',\n",
       "   'DISTRITO FEDERAL',\n",
       "   'distrito federal'],\n",
       "  'STATE_COUNT': 11930,\n",
       "  'STATE_RECOMMENDED': 'Distrito Federal'}]"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "keyCol.fingerprint_cluster(df, \"STATE\").to_json()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'STATE_NGRAM' with function _lower\n",
      "INFO:optimus:Using 'column_exp' to process column 'STATE_NGRAM' with function _remove_white_spaces\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_NGRAM' with function multiple_replace\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_NGRAM' with function _remove_accents\n",
      "INFO:optimus:Using 'column_exp' to process column 'count' with function _cast_to\n",
      "INFO:optimus:Using 'column_exp' to process column 'STATE' with function _cast_to\n",
      "INFO:optimus:Using 'column_exp' to process column 'STATE_NGRAM' with function _cast_to\n",
      "INFO:optimus:Using 'column_exp' to process column 'STATE_NGRAM' with function func_col_exp\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_NGRAM_FINGERPRINT' with function remote_white_spaces_remove_sort_join\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "\n",
       "\n",
       "\n",
       "\n",
       "<div class=\"info_items\">Viewing 1 of 1 rows / 4 columns</div>\n",
       "<div class=\"info_items\">200 partition(s)</div>\n",
       "\n",
       "<table class=\"optimus_table\">\n",
       "    <thead>\n",
       "    <tr>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">STATE_CLUSTER_SIZE</div>\n",
       "            <div class=\"data_type\">1 (int)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                not nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">STATE_CLUSTER</div>\n",
       "            <div class=\"data_type\">2 (array&lt;string&gt;)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">STATE_COUNT</div>\n",
       "            <div class=\"data_type\">3 (double)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">STATE_RECOMMENDED</div>\n",
       "            <div class=\"data_type\">4 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "    </tr>\n",
       "\n",
       "    </thead>\n",
       "    <tbody>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='8'>8\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='['Distrito&#8901;Federal',&#8901;'México&#8901;D.F.',&#8901;'DISTRITO&#8901;FEDERAL',&#8901;'Mexico&#8901;D.F.',&#8901;'Distriro&#8901;Federal',&#8901;'D.F.',&#8901;'Estado&#8901;de&#8901;México',&#8901;'distrito&#8901;federal']'>['Distrito&#8901;Federal',&#8901;'México&#8901;D.F.',&#8901;'DISTRITO&#8901;FEDERAL',&#8901;'Mexico&#8901;D.F.',&#8901;'Distr...\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='15560.0'>15560.0\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Mexico&#8901;D.F.'>Mexico&#8901;D.F.\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    </tbody>\n",
       "</table>\n",
       "\n",
       "\n",
       "<div class=\"info_items\">Viewing 1 of 1 rows / 4 columns</div>\n",
       "<div class=\"info_items\">200 partition(s)</div>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading page (1/2)\n",
      "Rendering (2/2)                                                    \n",
      "Done                                                               \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/table9.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "df_kc = keyCol.n_gram_fingerprint_cluster(df, \"STATE\" , 2)\n",
    "df_kc.table()\n",
    "df_kc.table_image(\"images/table9.png\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'STATE_NGRAM' with function _lower\n",
      "INFO:optimus:Using 'column_exp' to process column 'STATE_NGRAM' with function _remove_white_spaces\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_NGRAM' with function multiple_replace\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_NGRAM' with function _remove_accents\n",
      "INFO:optimus:Using 'column_exp' to process column 'count' with function _cast_to\n",
      "INFO:optimus:Using 'column_exp' to process column 'STATE' with function _cast_to\n",
      "INFO:optimus:Using 'column_exp' to process column 'STATE_NGRAM' with function _cast_to\n",
      "INFO:optimus:Using 'column_exp' to process column 'STATE_NGRAM' with function func_col_exp\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_NGRAM_FINGERPRINT' with function remote_white_spaces_remove_sort_join\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[{'STATE_CLUSTER_SIZE': 8,\n",
       "  'STATE_CLUSTER': ['Distrito Federal',\n",
       "   'México D.F.',\n",
       "   'DISTRITO FEDERAL',\n",
       "   'Mexico D.F.',\n",
       "   'Distriro Federal',\n",
       "   'D.F.',\n",
       "   'Estado de México',\n",
       "   'distrito federal'],\n",
       "  'STATE_COUNT': 15560.0,\n",
       "  'STATE_RECOMMENDED': 'Mexico D.F.'}]"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "keyCol.n_gram_fingerprint_cluster(df, \"STATE\" , 2).to_json()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Nearest Neighbor Methods"
   ]
  },
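  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Nearest neighbor methods compare values pairwise with a distance function instead of a shared key. The cells in this section use the Levenshtein (edit) distance. As a rough sketch of what that distance measures, here is a pure-Python version; `levenshtein` is a hypothetical helper written for illustration and is not part of the Optimus API.\n",
    "\n",
    "```python\n",
    "def levenshtein(a, b):\n",
    "    # previous[j] is the distance between the processed prefix of a and b[:j]\n",
    "    previous = list(range(len(b) + 1))\n",
    "    for i, char_a in enumerate(a, start=1):\n",
    "        current = [i]\n",
    "        for j, char_b in enumerate(b, start=1):\n",
    "            cost = 0 if char_a == char_b else 1\n",
    "            current.append(min(\n",
    "                previous[j] + 1,         # deletion\n",
    "                current[j - 1] + 1,      # insertion\n",
    "                previous[j - 1] + cost,  # substitution\n",
    "            ))\n",
    "        previous = current\n",
    "    return previous[-1]\n",
    "\n",
    "\n",
    "print(levenshtein(\"distrirofederal\", \"distritofederal\"))  # 1\n",
    "print(levenshtein(\"df\", \"mexicodf\"))  # 6\n",
    "```\n",
    "\n",
    "A small distance, such as the single substitution between \"distrirofederal\" and \"distritofederal\", suggests the two values are probably variants of the same thing."
   ]
  },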
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'STATE_FINGERPRINT' with function _trim\n",
      "INFO:optimus:Using 'column_exp' to process column 'STATE_FINGERPRINT' with function _lower\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function multiple_replace\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function _remove_accents\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function _split_sort_remove_join\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading page (1/2)\n",
      "Rendering (2/2)                                                    \n",
      "Done                                                               \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/table10.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from optimus.ml import distancecluster as dc\n",
    "df_dc = dc.levenshtein_matrix(df,\"STATE\")\n",
    "df_dc.table_image(\"images/table10.png\")\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'STATE_FINGERPRINT' with function _trim\n",
      "INFO:optimus:Using 'column_exp' to process column 'STATE_FINGERPRINT' with function _lower\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function multiple_replace\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function _remove_accents\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function _split_sort_remove_join\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "\n",
       "\n",
       "\n",
       "\n",
       "<div class=\"info_items\">Viewing 5 of 5 rows / 3 columns</div>\n",
       "<div class=\"info_items\">200 partition(s)</div>\n",
       "\n",
       "<table class=\"optimus_table\">\n",
       "    <thead>\n",
       "    <tr>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">STATE_FROM</div>\n",
       "            <div class=\"data_type\">1 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">STATE_LEVENSHTEIN_DISTANCE</div>\n",
       "            <div class=\"data_type\">2 (int)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">STATE_TO</div>\n",
       "            <div class=\"data_type\">3 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "    </tr>\n",
       "\n",
       "    </thead>\n",
       "    <tbody>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='estadodemexico'>estadodemexico\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='10'>10\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='mexicodf'>mexicodf\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='df'>df\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='6'>6\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='mexicodf'>mexicodf\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='distrirofederal'>distrirofederal\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='1'>1\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='distritofederal'>distritofederal\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='distritofederal'>distritofederal\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='1'>1\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='distrirofederal'>distrirofederal\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='mexicodf'>mexicodf\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='6'>6\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='df'>df\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    </tbody>\n",
       "</table>\n",
       "\n",
       "\n",
       "<div class=\"info_items\">Viewing 5 of 5 rows / 3 columns</div>\n",
       "<div class=\"info_items\">200 partition(s)</div>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading page (1/2)\n",
      "Rendering (2/2)                                                    \n",
      "Done                                                               \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/table11.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "df_dc=dc.levenshtein_filter(df,\"STATE\")\n",
    "df_dc.table()\n",
    "df_dc.table_image(\"images/table11.png\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'STATE_FINGERPRINT' with function _trim\n",
      "INFO:optimus:Using 'column_exp' to process column 'STATE_FINGERPRINT' with function _lower\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function multiple_replace\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function _remove_accents\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function _split_sort_remove_join\n",
      "INFO:optimus:Using 'column_exp' to process column 'STATE_FINGERPRINT' with function _trim\n",
      "INFO:optimus:Using 'column_exp' to process column 'STATE_FINGERPRINT' with function _lower\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function multiple_replace\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function _remove_accents\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function _split_sort_remove_join\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "\n",
       "\n",
       "\n",
       "\n",
       "<div class=\"info_items\">Viewing 5 of 5 rows / 4 columns</div>\n",
       "<div class=\"info_items\">1 partition(s)</div>\n",
       "\n",
       "<table class=\"optimus_table\">\n",
       "    <thead>\n",
       "    <tr>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">STATE_CLUSTER</div>\n",
       "            <div class=\"data_type\">1 (array&lt;string&gt;)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">STATE_CLUSTER_SIZE</div>\n",
       "            <div class=\"data_type\">2 (int)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">STATE_RECOMMENDED</div>\n",
       "            <div class=\"data_type\">3 (string)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "        <th>\n",
       "            <div class=\"column_name\">STATE_COUNT</div>\n",
       "            <div class=\"data_type\">4 (bigint)</div>\n",
       "            <div class=\"data_type\">\n",
       "                \n",
       "                nullable\n",
       "                \n",
       "            </div>\n",
       "        </th>\n",
       "        \n",
       "    </tr>\n",
       "\n",
       "    </thead>\n",
       "    <tbody>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='['Estado&#8901;de&#8901;México']'>['Estado&#8901;de&#8901;México']\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='1'>1\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Estado&#8901;de&#8901;México'>Estado&#8901;de&#8901;México\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='810'>810\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='['D.F.']'>['D.F.']\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='1'>1\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='D.F.'>D.F.\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='66'>66\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='['Distriro&#8901;Federal']'>['Distriro&#8901;Federal']\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='1'>1\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Distriro&#8901;Federal'>Distriro&#8901;Federal\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='259'>259\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='['Distrito&#8901;Federal',&#8901;'DISTRITO&#8901;FEDERAL',&#8901;'distrito&#8901;federal']'>['Distrito&#8901;Federal',&#8901;'DISTRITO&#8901;FEDERAL',&#8901;'distrito&#8901;federal']\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='3'>3\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Distrito&#8901;Federal'>Distrito&#8901;Federal\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='11930'>11930\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    <tr>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='['Mexico&#8901;D.F.',&#8901;'México&#8901;D.F.']'>['Mexico&#8901;D.F.',&#8901;'México&#8901;D.F.']\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='2'>2\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='Mexico&#8901;D.F.'>Mexico&#8901;D.F.\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "        <td>\n",
       "            <div class=\" \"\n",
       "                 title='2495'>2495\n",
       "            </div>\n",
       "        </td>\n",
       "        \n",
       "    </tr>\n",
       "    \n",
       "    </tbody>\n",
       "</table>\n",
       "\n",
       "\n",
       "<div class=\"info_items\">Viewing 5 of 5 rows / 4 columns</div>\n",
       "<div class=\"info_items\">1 partition(s)</div>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading page (1/2)\n",
      "Rendering (2/2)                                                    \n",
      "Done                                                               \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/table12.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "df_dc = dc.levenshtein_cluster(df,\"STATE\")\n",
    "df_dc.table()\n",
    "df_dc.table_image(\"images/table12.png\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Using 'column_exp' to process column 'STATE_FINGERPRINT' with function _trim\n",
      "INFO:optimus:Using 'column_exp' to process column 'STATE_FINGERPRINT' with function _lower\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function multiple_replace\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function _remove_accents\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function _split_sort_remove_join\n",
      "INFO:optimus:Using 'column_exp' to process column 'STATE_FINGERPRINT' with function _trim\n",
      "INFO:optimus:Using 'column_exp' to process column 'STATE_FINGERPRINT' with function _lower\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function multiple_replace\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function _remove_accents\n",
      "INFO:optimus:Using 'pandas_udf' to process column 'STATE_FINGERPRINT' with function _split_sort_remove_join\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[{'STATE_CLUSTER': ['Estado de México'],\n",
       "  'STATE_CLUSTER_SIZE': 1,\n",
       "  'STATE_RECOMMENDED': 'Estado de México',\n",
       "  'STATE_COUNT': 810},\n",
       " {'STATE_CLUSTER': ['D.F.'],\n",
       "  'STATE_CLUSTER_SIZE': 1,\n",
       "  'STATE_RECOMMENDED': 'D.F.',\n",
       "  'STATE_COUNT': 66},\n",
       " {'STATE_CLUSTER': ['Distriro Federal'],\n",
       "  'STATE_CLUSTER_SIZE': 1,\n",
       "  'STATE_RECOMMENDED': 'Distriro Federal',\n",
       "  'STATE_COUNT': 259},\n",
       " {'STATE_CLUSTER': ['Distrito Federal',\n",
       "   'DISTRITO FEDERAL',\n",
       "   'distrito federal'],\n",
       "  'STATE_CLUSTER_SIZE': 3,\n",
       "  'STATE_RECOMMENDED': 'Distrito Federal',\n",
       "  'STATE_COUNT': 11930},\n",
       " {'STATE_CLUSTER': ['Mexico D.F.', 'México D.F.'],\n",
       "  'STATE_CLUSTER_SIZE': 2,\n",
       "  'STATE_RECOMMENDED': 'Mexico D.F.',\n",
       "  'STATE_COUNT': 2495}]"
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dc.to_json(df, \"STATE\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Machine Learning \n",
    "\n",
    "Machine Learning is one of the last steps, and the goal, of most Data Science workflows.\n",
    "\n",
    "Apache Spark provides a library called MLlib that implements many Machine Learning algorithms. Now,\n",
    "with the ML library, we can take advantage of the DataFrame API and its optimizations to create Machine Learning pipelines easily.\n",
    "\n",
    "Even so, the task is not trivial. Most Machine Learning models on Spark are not straightforward to use,\n",
    "and they need a lot of feature engineering to work. That's why we created the feature engineering\n",
    "section inside Optimus."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One of the best \"tree\" models for machine learning is Random Forest. What about creating an RF model with just\n",
    "one line? With Optimus it is really easy."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:optimus:Downloading data_cancer.csv from https://raw.githubusercontent.com/ironmussa/Optimus/master/tests/data_cancer.csv\n",
      "INFO:optimus:Downloaded 125205 bytes\n",
      "INFO:optimus:Creating DataFrame for data_cancer.csv. Please wait...\n",
      "INFO:optimus:Successfully created DataFrame for 'data_cancer.csv'\n"
     ]
    }
   ],
   "source": [
    "df_cancer = op.load.csv(\"https://raw.githubusercontent.com/ironmussa/Optimus/master/tests/data_cancer.csv\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [],
   "source": [
    "columns = ['diagnosis', 'radius_mean', 'texture_mean', 'perimeter_mean', 'area_mean', 'smoothness_mean',\n",
    "           'compactness_mean', 'concavity_mean', 'concave points_mean', 'symmetry_mean',\n",
    "           'fractal_dimension_mean']\n",
    "\n",
    "df_predict, rf_model = op.ml.random_forest(df_cancer, columns, \"diagnosis\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This will create a DataFrame with the predictions of the Random Forest model.\n",
    "\n",
    "So let's see the predictions compared with the actual labels:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading page (1/2)\n",
      "Rendering (2/2)                                                    \n",
      "Done                                                               \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<img src='images/table13.png'>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "df_predict.cols.select([\"label\",\"prediction\"]).table_image(\"images/table13.png\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `rf_model` variable contains the Random Forest model for analysis.\n",
    " \n",
    "## Contributing to Optimus\n",
    "Contributions go far beyond pull requests and commits. We are very happy to receive any kind of contribution,   \n",
    "including:  \n",
    "  \n",
    "* [Documentation](https://github.com/ironmussa/Optimus/tree/master/docs/source) updates, enhancements, designs, or bug fixes.  \n",
    "* Spelling or grammar fixes.  \n",
    "* README.md corrections or redesigns.  \n",
    "* Adding unit or functional [tests](https://github.com/ironmussa/Optimus/tree/master/tests).  \n",
    "* Triaging GitHub issues -- especially determining whether an issue still persists or is reproducible.  \n",
    "* [Searching #optimusdata on Twitter](https://twitter.com/search?q=optimusdata) and helping someone else who needs help.  \n",
    "* [Blogging, speaking about, or creating tutorials](https://hioptimus.com/category/blog/) about Optimus and its many features.  \n",
    "* Helping others on [Discord](https://img.shields.io/discord/579030865468719104.svg)    \n",
    "  \n",
    "## Backers  \n",
    "[[Become a backer](https://opencollective.com/optimus#backer)] and get your image on our README on GitHub with a link to your site.  \n",
    "[![OpenCollective](https://opencollective.com/optimus/backers/badge.svg)](#backers)   "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Sponsors  \n",
    "[[Become a sponsor](https://opencollective.com/optimus#backer)] and get your image on our README on GitHub with a link to your site.  \n",
    "[![OpenCollective](https://opencollective.com/optimus/sponsors/badge.svg)](#sponsors)  \n",
    "  \n",
    "## Core Team\n",
    "Argenis Leon and Luis Aguirre\n",
    "\n",
    "## License:  \n",
    "  \n",
    "Apache 2.0 © [Iron](https://github.com/ironmussa)  \n",
    "  \n",
    "[![Logo Iron](https://iron-ai.com/wp-content/uploads/2017/08/iron-svg-2.png)](https://ironmussa.com)  \n",
    "  \n",
    "<a href=\"https://twitter.com/optimus_data\"><img src=\"https://www.shareicon.net/data/256x256/2015/09/01/94063_circle_512x512.png\" alt=\"Optimus twitter\" border=\"0\" height=\"60\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Post-process readme script. Always run this if you modify the notebook. \n",
    "\n",
    "This will recreate README.md"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The script below processes the ```readme_.md``` file that is output from this notebook: it removes the jupytext header and helper lines, replaces the table and plot calls with image links, and writes ```README.md```.\n",
    "\n",
    "To use the ```table_image()``` function, be sure to install imgkit: ```pip install imgkit```\n",
    "Also install wkhtmltopdf from https://wkhtmltopdf.org/downloads.html. It is responsible for rendering the Optimus tables as images."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [],
   "source": [
    "from shutil import copyfile\n",
    "output_file = \"../README.md\"\n",
    "copyfile(\"readme_.md\", output_file)\n",
    "\n",
    "import sys\n",
    "import fileinput\n",
    "import re\n",
    "\n",
    "pattern = r'\"([A-Za-z0-9_\\./\\\\-]*)\"'\n",
    "\n",
    "jupytext_header = False\n",
    "flag_remove = False\n",
    "\n",
    "remove = [\"load_ext\", \"autoreload\",\"import sys\",\"sys.path.append\"]\n",
    "\n",
    "buffer = None\n",
    "for i, line in enumerate(fileinput.input(output_file, inplace=1)):\n",
    "    done= False\n",
    "    try:\n",
    "        # Remove some helper lines\n",
    "        for r in remove:\n",
    "            if re.search(r, line):\n",
    "                done= True\n",
    "        \n",
    "        #Remove the post process code\n",
    "        if re.search(\"Post-process\", line):\n",
    "            flag_remove = True\n",
    "            \n",
    "        if flag_remove is True:\n",
    "            done = True        \n",
    "            \n",
    "        \n",
    "        # Remove jupytext header\n",
    "        if jupytext_header is True:\n",
    "            done = True\n",
    "            \n",
    "        if  \"---\\n\" == line: \n",
    "            jupytext_header = not jupytext_header      \n",
    "                    \n",
    "        elif done is False:\n",
    "     \n",
    "            # Replace .table_image(...) by table()\n",
    "            chars_table = re.search(r\"\\.table_image\", line)\n",
    "            chars_image = re.search(r\"\\.to_image\", line)\n",
    "            # A plot line must contain both '.plot.' and 'output_path='\n",
    "            chars_plot = len(re.findall(r'(\\.plot\\.|output_path=)', line)) == 2\n",
    "            \n",
    "            \n",
    "            \n",
    "            path = \"readme/\"\n",
    "            if chars_table:\n",
    "                print(line[0:int(chars_table.start())]+\".table()\")\n",
    "\n",
    "                m = re.search(r'table_image\\(\"(.*?)\"\\)', line).group(1)\n",
    "                if m:\n",
    "                    buffer = \"![](\"+ path + m + \")\"              \n",
    "            elif chars_image:\n",
    "                m = re.search(r'to_image\\(output_path=\"(.*?)\"\\)', line).group(1)\n",
    "                if m:\n",
    "                    buffer = \"![](\"+ path + m + \")\"  \n",
    "            elif chars_plot:\n",
    "\n",
    "                m = re.search('output_path=\"(.*?)\"', line).group(1)\n",
    "\n",
    "                if m:\n",
    "                    buffer = \"![](\"+ path + m + \")\"  \n",
    "            \n",
    "            else:\n",
    "                sys.stdout.write(line)\n",
    "                \n",
    "            if \"```\\n\"==line and buffer:                \n",
    "                print(buffer)\n",
    "                buffer = None\n",
    "                \n",
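    "    except Exception as e:\n",
    "        # stdout is redirected into the file while inplace editing is active, so report errors on stderr\n",
    "        print(e, file=sys.stderr)\n",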
    "        \n",
    "fileinput.close()\n",
    "\n",
    "\n",
    "# Remove empty python cells\n",
    "flag = False\n",
    "for i, line in enumerate(fileinput.input(output_file, inplace=1)):\n",
    "   \n",
    "    if re.search(\"```python\", line):     \n",
    "        flag = True\n",
    "    elif re.search(\"```\", line) and flag is True:\n",
    "        flag=False\n",
    "    elif flag is True:\n",
    "        flag = False\n",
    "        print(\"```python\")\n",
    "        print(line,end=\"\")\n",
    "    else:\n",
    "        print(line, end=\"\")\n",
    "                    \n",
    "        \n",
    "fileinput.close()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "images/profiler.png\n"
     ]
    }
   ],
   "source": [
    "line = 'op.profiler.to_image(output_path=\"images/profiler.png\")'\n",
    "m = re.search(r'to_image\\(output_path=\"(.*?)\"\\)', line).group(1)\n",
    "print(m)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "jupytext": {
   "formats": "ipynb,md"
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
